Data connectors

Purpose: This guide explains what data connectors are and why they are important. It covers the advantages they offer and walks through the different types of data connectors.
Created: November 27, 2023

Understanding Data Connectors

Data Connectors facilitate communication and data exchange among different software applications, databases, and systems. In essence, they act as orchestrators, extracting data from various sources and transporting it to designated destinations, such as data warehouses.

Functionality

  • Data Extraction: Retrieving data from the source.

  • Data Transformation: Altering the format or structure of the data.

  • Data Loading: Moving the data into the destination.

  • Authentication and Security: Securing the process, ensuring that only authorized users and systems gain access to the data.

  • Monitoring and Logging: Tracking each run and recording what happens. If something goes wrong, the failure is logged and, when configured, an alert is sent out.
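
The functions listed above can be sketched as one small pipeline. This is a minimal illustration, not a reference implementation: the function names, the token check, and the in-memory source and destination are all assumptions made for the example.

```python
import logging
from typing import Dict, Iterable, List, Set

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("connector")  # hypothetical logger name


def extract(source: Iterable[Dict]) -> List[Dict]:
    """Data extraction: read every record from the source."""
    return list(source)


def transform(records: List[Dict]) -> List[Dict]:
    """Data transformation: rename fields and fix types (assumed schema)."""
    return [{"id": int(r["ID"]), "name": r["Name"].strip()} for r in records]


def load(records: List[Dict], destination: List[Dict]) -> None:
    """Data loading: move the records into the destination."""
    destination.extend(records)


def run_connector(source: Iterable[Dict], destination: List[Dict],
                  token: str, allowed_tokens: Set[str]) -> int:
    """Run one extract-transform-load cycle and return the number of rows loaded."""
    # Authentication and security: refuse callers that are not authorized.
    if token not in allowed_tokens:
        raise PermissionError("caller is not authorized to run this connector")
    try:
        rows = transform(extract(source))
        load(rows, destination)
        # Monitoring and logging: record the outcome of a successful run.
        log.info("loaded %d records", len(rows))
        return len(rows)
    except Exception:
        # A real connector might also trigger an alert here, if configured.
        log.exception("connector run failed")
        raise
```

In practice the source and destination would be a database, API, or warehouse client rather than Python lists, and authentication would rely on the credentials mechanism of those systems.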

Benefits of Using Data Connectors

Incorporating Data Connectors into your development toolkit yields several advantages:

  • Data Integration: Effortlessly merges data from disparate sources.

  • Informed Decision-Making: Enables more informed decisions by utilizing data from multiple sources.

  • Automation: Streamlines data integration processes, reducing manual intervention.

  • Standardization: Establishes a standardized framework for handling data from different sources.

Data Retrieval Methods

When designing a Data Connector, developers can choose between two primary retrieval methods:

  1. Snapshot: Gathers all data from the source, including both new and existing data.

  2. Incremental: Collects only newly accumulated data since the last run of the connector.
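
The difference between the two retrieval methods can be sketched as follows. The example assumes each source row carries a `created_at` timestamp and that the connector remembers a "watermark" (the newest timestamp it has already collected); both are illustrative assumptions, as are the sample rows.

```python
from datetime import datetime
from typing import Dict, List, Tuple

# Hypothetical source rows, used only for illustration.
SOURCE_ROWS = [
    {"id": 1, "created_at": datetime(2023, 11, 1)},
    {"id": 2, "created_at": datetime(2023, 11, 15)},
    {"id": 3, "created_at": datetime(2023, 11, 27)},
]


def snapshot(rows: List[Dict]) -> List[Dict]:
    """Snapshot: return every row, new and existing alike."""
    return list(rows)


def incremental(rows: List[Dict], watermark: datetime) -> Tuple[List[Dict], datetime]:
    """Incremental: return only rows newer than the watermark, plus the new watermark."""
    new_rows = [r for r in rows if r["created_at"] > watermark]
    # Keep the old watermark when nothing new arrived.
    new_watermark = max((r["created_at"] for r in new_rows), default=watermark)
    return new_rows, new_watermark
```

Storing the returned watermark between runs is what lets the next incremental run skip rows that were already collected.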

Data Loading Modes

Three primary modes exist for loading data into a destination system:

  • Replace: Replaces all data in the destination system with each load execution. Existing data corresponding to the source dataset is entirely replaced.

    Note: Replace mode is often implemented with snapshot data extraction because the entire dataset is transferred.

  • Append: Adds the new dataset to the existing data in the destination system without deleting current data.

    Note: Append mode is commonly used with incremental data extraction to avoid transferring the entire dataset.

  • Merge: Reconciles and combines data extracted from the source system into the destination dataset. New data is added, and existing data that has changed in the source is updated in the destination.

    Note: Merge mode is preferred to preserve changes in existing data and prevent duplication.
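
The three loading modes can be compared on a small in-memory dataset. This is a sketch under stated assumptions: rows are dictionaries, the destination is a plain list, and records are matched on a hypothetical `id` key.

```python
from typing import Dict, List


def load_replace(destination: List[Dict], incoming: List[Dict]) -> List[Dict]:
    """Replace: discard the existing destination data and keep only the new load."""
    return list(incoming)


def load_append(destination: List[Dict], incoming: List[Dict]) -> List[Dict]:
    """Append: add the incoming rows without deleting or updating existing rows."""
    return destination + list(incoming)


def load_merge(destination: List[Dict], incoming: List[Dict],
               key: str = "id") -> List[Dict]:
    """Merge: update rows whose key already exists and add rows whose key is new."""
    merged = {row[key]: row for row in destination}
    for row in incoming:
        merged[row[key]] = row  # overwrites a changed row, inserts a new one
    return list(merged.values())
```

Appending a row whose key already exists in the destination produces a duplicate, which is the situation merge mode avoids.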

Types of Data Connectors

Data Connector | Description | Use Case
Database or Data Warehouse | Relational, NoSQL, transactional, and analytical databases. | Extract data from internal business applications and data systems.
Application Programming Interface (API) | Endpoints exposed by software applications, enabling data querying and extraction. | Extract data from internal or external software applications.
Flat File Transfer | Flat files (CSV) for exporting data from legacy applications. | Integration with legacy systems lacking modern interfaces, such as REST APIs.
Cloud Object Storage | Cloud storage. | Cloud-enabled data sources with globally distributed data destinations.
Event Queues | Modern data mechanisms in microservice applications. | Microservice-based applications publishing events to queues.
Event Streams | Mechanisms where applications expose data streams observed by connectors. | Applications generating real-time streaming data.
Internet of Things (IoT) Connectors | Distributed fleets of devices transferring data to central hubs. | Devices in both internal business and consumer contexts for monitoring and diagnostics.
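
Different connector types can share one interface so that downstream code does not depend on where the data comes from. The sketch below illustrates this with two of the types listed: a flat file (CSV) connector and an event queue connector. The `Connector` protocol, the class names, and the `collect` helper are hypothetical, and an in-memory string and a standard-library `queue.Queue` stand in for a real file and a real message broker.

```python
import csv
import io
import queue
from typing import Dict, Iterator, List, Protocol


class Connector(Protocol):
    """Common interface (assumed for this example): every connector yields records."""

    def read(self) -> Iterator[Dict]: ...


class FlatFileConnector:
    """Reads records from CSV text, as exported by a legacy application."""

    def __init__(self, text: str) -> None:
        self._text = text

    def read(self) -> Iterator[Dict]:
        yield from csv.DictReader(io.StringIO(self._text))


class EventQueueConnector:
    """Drains the events currently waiting in a queue."""

    def __init__(self, events: "queue.Queue[Dict]") -> None:
        self._events = events

    def read(self) -> Iterator[Dict]:
        while True:
            try:
                yield self._events.get_nowait()
            except queue.Empty:
                return  # no more events waiting right now


def collect(connectors: List[Connector]) -> List[Dict]:
    """Gather records from every connector through the shared interface."""
    return [record for c in connectors for record in c.read()]
```

Note that values read from CSV arrive as strings, so a transformation step is usually needed before records from different connectors can be combined consistently.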

Incorporating these Data Connectors into your development projects ensures a streamlined and interconnected data environment. By understanding their types, benefits, and usage scenarios, developers can effectively implement data integration solutions tailored to their specific needs.