Data Management

Data management is the practice of acquiring, storing, validating, and securing data across an enterprise. It helps organizations manage their information and answer questions such as: What do we know about our information? Where did this data come from?

Benefits of Data Management

Big data

Manage big data to give the organization customer insights and new business opportunities

Data warehouse

Generate reports and analyses from multiple sources to support rational business decisions

Integration

Combine data from different sources and visualize it in a single view

Data Cleansing

Multiple processes that prepare and organize data so that it can be processed efficiently
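As a rough illustration of what such cleansing steps can look like, here is a minimal Python sketch using only the standard library. The records, the field names (patient_id, age, smoker), and the rules (trim whitespace, convert types, drop duplicate ids) are assumptions made up for this example and are not taken from any real dataset.

```python
from typing import Optional

# Hypothetical raw records; field names and values are invented for illustration.
RAW_RECORDS = [
    {"patient_id": "001", "age": " 45 ", "smoker": "Yes"},
    {"patient_id": "002", "age": "", "smoker": "no"},
    {"patient_id": "001", "age": "45", "smoker": "yes"},  # duplicate id
    {"patient_id": "003", "age": "abc", "smoker": "NO "},
]


def parse_age(value: str) -> Optional[int]:
    """Return the age as an int, or None when the value is missing or invalid."""
    text = value.strip()
    return int(text) if text.isdigit() else None


def clean(records):
    """Trim whitespace, normalize types, and drop duplicate patient ids."""
    seen = set()
    cleaned = []
    for row in records:
        pid = row["patient_id"].strip()
        if pid in seen:
            continue  # keep only the first record per patient
        seen.add(pid)
        cleaned.append({
            "patient_id": pid,
            "age": parse_age(row["age"]),
            "smoker": row["smoker"].strip().lower() == "yes",
        })
    return cleaned


cleaned_records = clean(RAW_RECORDS)
```

Invalid or missing ages are kept as None rather than guessed, so a later step can decide whether to drop or impute them.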

Use Case

“Data Management for a Health Checkup Program in the Healthcare Industry”

Challenge

One of our clients is in the healthcare industry and has more than one million patients. The client aimed to use machine learning to offer each patient a suitable medical checkup package, which meant analyzing every patient individually. We provided a web application that lets the client personalize a medical checkup for a single patient.

Key Success

After gathering the project requirements, we received the data that would serve as features for the analysis model. We inspected the data in Python and checked for anything that needed attention, a process called data cleansing. We then built a classification model in Python that promptly returns the most relevant medical checkup package for a patient. The most important step in machine learning is preparing and cleansing the data, which usually takes up to 80% of a project's total time. Quality data results in quality output.
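To make the idea of a classification model concrete, here is a minimal Python sketch. The project description does not say which classifier or which features were used, so this example substitutes a simple nearest-centroid classifier. The two features (age and body-mass index), the package names ("basic", "cardiac"), and all data values are hypothetical.

```python
from collections import defaultdict
from math import dist

# Hypothetical training data: (age, body-mass index) -> checkup package.
TRAINING = [
    ((25, 21.0), "basic"),
    ((30, 23.0), "basic"),
    ((55, 29.0), "cardiac"),
    ((62, 31.0), "cardiac"),
]


def fit_centroids(samples):
    """Compute the mean feature vector for each package label."""
    grouped = defaultdict(list)
    for features, label in samples:
        grouped[label].append(features)
    return {
        label: tuple(sum(col) / len(rows) for col in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(centroids, features):
    """Return the label whose centroid is closest to the given features."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


centroids = fit_centroids(TRAINING)
recommended = predict(centroids, (58, 30.0))  # features of one new patient
```

A real project would typically scale the features and evaluate the model on held-out data before using its recommendations; the sketch only shows the shape of the train-then-predict flow.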

Methodology

Useful tools that make data management easier for your business

Python

Developing algorithms to prepare, organize, analyze, and visualize data for users

Hadoop

Using multi-node clusters to store and analyze big data

PostgreSQL

Using a popular open-source database for businesses that store structured data

MongoDB

Using a well-known open-source database for businesses that store unstructured data

Elastic Stack

Formerly known as the ELK Stack, an acronym for three open-source projects: Elasticsearch, Logstash, and Kibana. Elasticsearch is a search and analytics engine. Logstash is a server-side data processing pipeline. Kibana lets users visualize data.

IBM Cloud