Relational database management systems (RDBMS) are powerful tools for storing, querying, and retrieving data, as long as you can force the data into a model that you don't need to change very often. However, modern data, and health data in particular, needs flexible schemas and richer models that do not fit the legacy relational model. Additionally, RDBMS were not designed to support the scale and flexibility that modern health analytics demand. Today's approach to writing analytics presents three major problems that make implementations brittle:
- The SQL code for an analytic is tightly coupled to the representation of the data. If the data source or data structure changes, the analytic logic breaks.
- When new data is required or a different view of the data is needed, the RDBMS schema must be redesigned, potentially breaking the application logic.
- Implementations are slow and serial, bottlenecked by traditional RDBMS and reporting technologies, and limited by the speed of a single server and by how far its queries can be optimized.
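The first two problems can be seen in a small sketch. It uses Python's built-in `sqlite3` module as a stand-in for any RDBMS; the `vitals` table, its column names, and the `run_analytic` helper are hypothetical and chosen only for illustration. The analytic's SQL text names a column directly, so a schema redesign that renames that column breaks the analytic even though its logic has not changed.

```python
import sqlite3

# Hypothetical analytic: the table and column names are hard-coded in the SQL.
ANALYTIC_SQL = "SELECT AVG(systolic_bp) FROM vitals WHERE patient_id = ?"


def run_analytic(conn, patient_id):
    """Run the analytic against whatever schema the connection currently has."""
    return conn.execute(ANALYTIC_SQL, (patient_id,)).fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vitals (patient_id INTEGER, systolic_bp REAL)")
conn.executemany("INSERT INTO vitals VALUES (?, ?)", [(1, 120.0), (1, 130.0)])
before = run_analytic(conn, 1)  # succeeds against the original schema

# Simulated schema redesign: same data, but the column gets a new name.
conn.execute("ALTER TABLE vitals RENAME TO vitals_old")
conn.execute("CREATE TABLE vitals (patient_id INTEGER, sbp_mmhg REAL)")
conn.execute("INSERT INTO vitals SELECT patient_id, systolic_bp FROM vitals_old")
conn.execute("DROP TABLE vitals_old")

try:
    run_analytic(conn, 1)
    broke = False
except sqlite3.OperationalError:
    # The unchanged analytic no longer matches the data representation.
    broke = True
```

The data itself is intact after the redesign; only its representation moved, and that alone is enough to break the analytic.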
New technologies such as Hadoop, NoSQL, and distributed computing are changing the technology landscape for processing large data sets. With NoSQL database technologies, such as the MongoDB document-oriented database used in Apervita, developers can easily upload new and frequently changing data with no prior knowledge of the facts or the source data structure. In fact, Apervita does away with a data schema altogether. Apervita introduces the concept of late binding, where data only needs to be connected to the analytic at runtime through Connectors.
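The idea of late binding can be sketched in a few lines of Python. This is not Apervita's API: the `Connector` type, the `run_late_bound` helper, the BMI analytic, and both sample records are assumptions made up for illustration. The point is only that the analytic is written against abstract input names, and a per-source set of connectors resolves those names against the actual data at runtime.

```python
from typing import Any, Callable, Dict

# Hypothetical model of a connector: a function that extracts one named
# input from a raw source record, converting units if necessary.
Connector = Callable[[Dict[str, Any]], Any]


def bmi_analytic(inputs: Dict[str, float]) -> float:
    """Analytic logic that knows only abstract input names, not source layouts."""
    return inputs["weight_kg"] / inputs["height_m"] ** 2


def run_late_bound(analytic, connectors: Dict[str, Connector], record: Dict[str, Any]):
    """Resolve each input through its connector at call time, then run the analytic."""
    inputs = {name: connect(record) for name, connect in connectors.items()}
    return analytic(inputs)


# Two hypothetical sources holding the same facts in different shapes.
source_a = {"weight_kg": 80.0, "height_m": 2.0}
source_b = {"vitals": {"wt_lb": 176.37, "ht_in": 78.74}}

connectors_a = {
    "weight_kg": lambda r: r["weight_kg"],
    "height_m": lambda r: r["height_m"],
}
connectors_b = {
    "weight_kg": lambda r: r["vitals"]["wt_lb"] * 0.45359237,  # pounds to kg
    "height_m": lambda r: r["vitals"]["ht_in"] * 0.0254,       # inches to metres
}

# The same analytic runs unchanged against both sources.
bmi_a = run_late_bound(bmi_analytic, connectors_a, source_a)
bmi_b = run_late_bound(bmi_analytic, connectors_b, source_b)
```

In this sketch, supporting a new data source means writing a new set of connectors; the analytic function itself is never edited.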