Medical Big Data Cleaning and Visualization Research
Abstract
and integration of multiple data sources. Python offers a rich set of data processing libraries, such as Pandas and NumPy, that greatly
facilitate data quality evaluation and cleaning. Kettle, in turn, excels at large-scale batch processing and transformation, enabling efficient
management of complex data flows. By combining Python and Kettle, we developed a flexible and efficient data cleaning workflow to ensure
data accuracy, consistency, and completeness.
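
As an illustration of the Python side of such a workflow, the following is a minimal sketch of a data quality evaluation and cleaning step with Pandas. It is not the authors' implementation: the column names (patient_id, visit_date, sex, age), the sample records, and the age range of 0 to 120 are all hypothetical assumptions chosen for the example, and the Kettle transformation stage is not shown.

```python
import numpy as np
import pandas as pd


def quality_report(df: pd.DataFrame) -> pd.DataFrame:
    """Per-column completeness summary: missing-value count and ratio."""
    missing = df.isna().sum()
    return pd.DataFrame({
        "missing": missing,
        "missing_ratio": missing / len(df),
    })


def clean_records(df: pd.DataFrame) -> pd.DataFrame:
    """Apply basic cleaning rules (columns and thresholds are assumptions)."""
    out = df.copy()
    # Consistency: map free-text sex values to a single coding.
    out["sex"] = (
        out["sex"].str.strip().str.upper().replace({"MALE": "M", "FEMALE": "F"})
    )
    # Accuracy: treat ages outside an assumed plausible range as missing.
    out["age"] = out["age"].where(out["age"].between(0, 120))
    # Completeness: drop rows without a patient identifier,
    # then remove repeated (patient, visit) records.
    out = out.dropna(subset=["patient_id"])
    out = out.drop_duplicates(subset=["patient_id", "visit_date"])
    return out.reset_index(drop=True)


# Hypothetical sample records, invented for illustration only.
records = pd.DataFrame({
    "patient_id": ["P001", "P001", "P002", None, "P003"],
    "visit_date": ["2024-01-05", "2024-01-05", "2024-01-06",
                   "2024-01-07", "2024-01-08"],
    "sex": [" male", " male", "F", "M", "female "],
    "age": [34, 34, 230, 51, np.nan],
})

report = quality_report(records)   # evaluate quality before cleaning
cleaned = clean_records(records)   # apply the cleaning rules
```

Running the report before cleaning keeps the evaluation and the correction steps separate, so the same rules could in principle be applied to batches exported from an ETL tool such as Kettle.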
DOI: http://dx.doi.org/10.70711/frim.v3i12.7870