Transfer Learning for Small Data Analytics: Methodological Solutions to Sample Size Constraints

Asher Saqib

Authors

Asher Saqib, Department of Information Technology, Washington University of Science and Technology, USA

Keywords:

Transfer Learning, Small-Data Analytics, Sample Size Constraints, Hierarchical Bayesian Modelling, Conformal Prediction, HDLSS, Machine Learning, Data Scarcity, Methodological Solutions

Abstract

Limited labelled examples, high-dimensional low-sample-size (HDLSS) settings and general data scarcity are key obstacles for modern machine learning in scientific research, healthcare and developing-country applications. This study proposes a systematic transfer learning framework that offers methodological answers to small-sample limitations in small-data analytics. A three-layer architecture was designed and tested, combining transfer learning, hierarchical Bayesian modelling with adaptive shrinkage, and conformal prediction with finite-sample coverage guarantees. On customer churn datasets with 100 observations, transfer learning improves AUC over independent logistic regression by 24.2 percentage points (96.7% ± 4.2% versus 72.5% ± 8.1%), a statistically significant difference (p < 0.000001, Cohen's d = 3.82). Conformal prediction reaches 92% empirical coverage at a 90% target, and requires 2.3 GB of RAM and 33 minutes of training time on a standard CPU machine. The results open the door to AI for millions of small businesses and research institutions that have been unable to use machine learning because their datasets are too small for conventional methods. Transfer learning is a methodological paradigm shift that allows enterprise-class predictions from a very small fraction of the data, in fields ranging from healthcare and finance to agriculture and education.
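To make the finite-sample coverage claim concrete, the following is a minimal sketch of split conformal prediction for a classifier. It is not the implementation evaluated in this study: the abstract does not specify the nonconformity score, so the score `1 - p(true label)`, the synthetic data and every function name here are assumptions chosen for illustration. The sketch uses only the Python standard library.

```python
import math
import random


def conformal_quantile(scores, alpha):
    """Finite-sample-corrected threshold from calibration scores.

    Taking the ceil((n + 1) * (1 - alpha))-th smallest score gives marginal
    coverage of at least 1 - alpha when calibration and test points are
    exchangeable.
    """
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: only the trivial
        # "include every label" set can carry the guarantee.
        return math.inf
    return sorted(scores)[k - 1]


def prediction_set(probs, qhat):
    """Labels whose assumed score 1 - p(label) does not exceed the threshold."""
    return {label for label, p in probs.items() if 1.0 - p <= qhat}


def draw_example(rng):
    """Synthetic binary example (an assumption, not the churn data)."""
    p1 = rng.random()                      # model's probability for class 1
    y = 1 if rng.random() < p1 else 0      # label drawn from that probability
    return {1: p1, 0: 1.0 - p1}, y


def average_coverage(trials=200, n_cal=100, n_test=500, alpha=0.1, seed=0):
    """Average empirical coverage over repeated calibration/test splits."""
    rng = random.Random(seed)
    covered = 0
    for _ in range(trials):
        # Calibrate: nonconformity score of the true label for each point.
        cal_scores = []
        for _ in range(n_cal):
            probs, y = draw_example(rng)
            cal_scores.append(1.0 - probs[y])
        qhat = conformal_quantile(cal_scores, alpha)
        # Test: count how often the prediction set contains the true label.
        for _ in range(n_test):
            probs, y = draw_example(rng)
            if y in prediction_set(probs, qhat):
                covered += 1
    return covered / (trials * n_test)


if __name__ == "__main__":
    print(f"average empirical coverage: {average_coverage():.3f}")
```

With 100 calibration points and a 90% target, the threshold is the 91st smallest calibration score, so the expected coverage is slightly above the target. The guarantee is marginal: coverage on any single calibration split can fall somewhat above or below 90%.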

Author Biography

Asher Saqib, Department of Information Technology, Washington University of Science and Technology, USA

Email: asher124@outlook.com
