TY - GEN
T1 - Change discovery in heterogeneous data sources of a data warehouse
AU - Solodovņikova, Darja
AU - Niedrīte, Laila
N1 - Publisher Copyright: © Springer Nature Switzerland AG 2020.
PY - 2020
Y1 - 2020
AB - Data warehouses have been used to analyze data stored in relational databases for several decades. However, over time, the data employed in the decision-making process have become so large and heterogeneous that traditional data warehousing solutions have become unusable. Therefore, new big data technologies have emerged to deal with large volumes of data. The problem of structural evolution of integrated heterogeneous data sources has become extremely topical due to the dynamic and diverse nature of big data. In this paper, we propose an approach to change discovery in the data sources of a data warehouse used to analyze big data. Our solution incorporates an architecture that supports OLAP operations and other kinds of analysis on integrated big data and is able to detect changes in the schemata and other characteristics of structured, semi-structured and unstructured data sources. We discuss the algorithm for change discovery and the metadata necessary for its operation.
KW - Big data
KW - Data warehouse
KW - Evolution
KW - Metadata
UR - https://link.springer.com/chapter/10.1007%2F978-3-030-57672-1_3
UR - https://www.scopus.com/pages/publications/85089724430
U2 - 10.1007/978-3-030-57672-1_3
DO - 10.1007/978-3-030-57672-1_3
M3 - Conference paper
SN - 9783030576714
VL - 1243
T3 - Communications in Computer and Information Science
SP - 23
EP - 37
BT - Databases and Information Systems - 14th International Baltic Conference, DB and IS 2020, Proceedings
A2 - Robal, Tarmo
A2 - Haav, Hele-Mai
A2 - Penjam, Jaan
A2 - Matulevičius, Raimundas
ER -