Integration Guide
Best practices and process for no-ETL integration
Introduction
This guide provides essential steps for integrating with Kubit's Self-Service Analytics. It outlines various methods to share your data with Kubit and configure your metrics accurately. Additionally, it offers best practices for data model design and implementation.
Integration Approaches
1. Data Share
Enable Kubit to securely access your cloud data warehouse directly, ensuring no data is copied or transferred between accounts. This method eliminates the need for ETL or batch jobs and extra storage. Kubit delivers real-time insights from live data, maintaining a Single Source of Truth. For more details, visit the Setup a Data Share page.
2. Direct Connect
Kubit connects directly to your data warehouse through a secure connection, providing real-time data access and control. This approach maintains a Single Source of Truth. For more information, see the Direct Connect page.
3. CDP Integration
If you use a Customer Data Platform (CDP) such as Segment, Snowplow, Rudderstack, or mParticle, configure a Snowflake Destination that points to a Kubit account to centralize your analytical data. Kubit covers the compute and storage costs for Self-Service Analytics.
Process
- Discuss Project Scope: Define use cases, KPIs, metrics, and dimensions with Kubit. (Duration: 1-2 hours)
- Share Data Access: Choose an integration method to share data with Kubit. (Duration: 1 hour - 2 days, depending on method)
- Log into Kubit: Integrate Single Sign-on (SSO) or provide user access lists.
- Data Validation: Collaborate with a Customer Success Manager to ensure data accuracy and requirement fulfillment.
About Your Data
Ensure your data is stored in data warehouse tables and updated frequently. Here are key considerations and best practices for integration.
Data Model
A data model, often depicted as an ER diagram, captures relationships between database entities. Avoid sharing schema-less transactional data for analytics. A star schema is recommended for storing analytics data.
- Fact Tables: Events, Transactions (e.g., Purchases, Subscriptions)
- Dimension Tables: Users, Campaigns, Attributions
If your data model differs, collaborate with data engineers for solutions.
Fact Table Recommendations
Fact tables record events with inline properties (e.g., timestamp, country) and dimension keys (e.g., user_id). Prefer inline properties where possible: they avoid joins, which simplifies queries and improves performance. The resulting duplication is rarely a storage concern, because modern warehouses like Snowflake store repeated values efficiently.
ETL implementation for star schema population is beyond this guide's scope. Contact Kubit support for assistance.
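The difference between an inline property and a dimension key can be sketched with a tiny in-memory star schema. This is only an illustration using Python's built-in sqlite3 module, not Kubit's or Snowflake's actual setup; the table names, the column names, and the `signup_channel` attribute are all hypothetical.

```python
import sqlite3

# In-memory database standing in for a data warehouse (illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension table: one row per user (hypothetical columns).
    CREATE TABLE users (
        user_id        TEXT PRIMARY KEY,
        signup_channel TEXT
    );
    -- Fact table: one row per event.
    CREATE TABLE events (
        event_name TEXT,
        event_ts   TEXT,                           -- inline property
        country    TEXT,                           -- inline property
        user_id    TEXT REFERENCES users(user_id)  -- dimension key
    );
""")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("u1", "organic"), ("u2", "paid")],
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?)",
    [
        ("Purchase", "2024-01-01 10:00:00", "US", "u1"),
        ("Purchase", "2024-01-01 11:00:00", "DE", "u2"),
        ("Login", "2024-01-01 12:00:00", "US", "u1"),
    ],
)

# Breaking down by an inline property needs no join.
by_country = conn.execute(
    "SELECT country, COUNT(*) FROM events "
    "WHERE event_name = 'Purchase' GROUP BY country ORDER BY country"
).fetchall()

# Breaking down by a dimension attribute requires a join on the dimension key.
by_channel = conn.execute(
    "SELECT u.signup_channel, COUNT(*) FROM events e "
    "JOIN users u ON u.user_id = e.user_id "
    "WHERE e.event_name = 'Purchase' "
    "GROUP BY u.signup_channel ORDER BY u.signup_channel"
).fetchall()
```

The first query reads only the fact table, which is why frequently used breakdowns are good candidates for inline properties.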
Recommended Data Structures
User Identification
Every user, registered or anonymous, requires a unique identifier for analytics. Common identifiers include:
- Device ID: For privacy reasons, use the Advertiser ID as the device identifier.
- User ID: Generate upon app launch for all users.
- Account ID: Use sparingly for multi-app analysis.
- Credentials: Avoid storing user credentials in analytics data.
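Generating a User ID at app launch, so that anonymous users are covered too, might look like the following minimal sketch. The `get_or_create_user_id` helper and the dictionary standing in for on-device storage are assumptions made for illustration; a real app would use its platform's persistent storage.

```python
import uuid


def get_or_create_user_id(store: dict) -> str:
    """Return the persisted user ID, creating one on first launch.

    `store` is a hypothetical stand-in for persistent local storage.
    """
    if "user_id" not in store:
        # A random UUID carries no personal information or credentials.
        store["user_id"] = str(uuid.uuid4())
    return store["user_id"]


local_store = {}
first_launch_id = get_or_create_user_id(local_store)
second_launch_id = get_or_create_user_id(local_store)
```

Because the ID is persisted, later launches reuse the same value, which keeps a user's events tied together before and after registration.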
Events
Events, such as Login or Page View, are critical for analytics. Maintain a data dictionary to define each event's name, trigger, properties, owner, and version history.
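One possible shape for a data dictionary entry is sketched below. The `EventDefinition` class, its field types, and the sample Login entry are hypothetical; only the list of fields (name, trigger, properties, owner, version history) comes from this guide.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class EventDefinition:
    name: str
    trigger: str                 # when the event fires
    properties: Dict[str, str]   # property name -> description
    owner: str
    version_history: List[str] = field(default_factory=list)


# Hypothetical dictionary with a single documented event.
DATA_DICTIONARY = {
    "Login": EventDefinition(
        name="Login",
        trigger="User submits valid credentials",
        properties={"method": "Login method, e.g. password or sso"},
        owner="identity-team",
        version_history=["v1: initial definition"],
    ),
}


def undocumented_properties(event_name: str, payload: dict) -> List[str]:
    """List payload keys that the data dictionary does not define."""
    definition = DATA_DICTIONARY[event_name]
    return sorted(set(payload) - set(definition.properties))


extra = undocumented_properties("Login", {"method": "sso", "foo": 1})
```

A check like `undocumented_properties` could run in tests or in an ingestion step to catch instrumentation drift early.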
Instrumentation
Instrument events in both mobile and server code. Consider a CDP such as Segment, which provides SDKs and control over the collected data. Communicate event definitions clearly to engineers to avoid instrumentation errors.
Properties
Events have Common and Event Properties. Common properties include:
- Timestamp: Event occurrence time.
- Date Parts: Store separately for analytics.
- Device and User IDs: Identifiers for analytics.
- App Details: App ID, version, and locale.
Event properties should be generic and managed via a data dictionary.
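The split between Common and Event Properties can be illustrated with a small helper that assembles an event record. The `build_event` function and every field name in it are assumptions for illustration, not a Kubit schema.

```python
from datetime import datetime


def build_event(name, timestamp, device_id, user_id, app, event_properties):
    """Combine common properties with event-specific ones (hypothetical layout)."""
    return {
        "event_name": name,
        # Common properties, present on every event.
        "timestamp": timestamp.isoformat(sep=" "),
        "event_date": timestamp.date().isoformat(),  # date part stored separately
        "event_hour": timestamp.hour,                # date part stored separately
        "device_id": device_id,
        "user_id": user_id,
        "app_id": app["id"],
        "app_version": app["version"],
        "locale": app["locale"],
        # Event properties, defined per event in the data dictionary.
        "properties": dict(event_properties),
    }


event = build_event(
    "Page View",
    datetime(2024, 3, 5, 14, 30, 0),
    device_id="d-123",
    user_id="u-456",
    app={"id": "shop", "version": "2.1.0", "locale": "en_US"},
    event_properties={"page": "home"},
)
```

Storing the date parts as separate fields lets reports group by day or hour without parsing the timestamp in every query.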
Timezone
Store time values in timestamp or date columns that carry no timezone information (for example, TIMESTAMP_NTZ in Snowflake). Use one consistent timezone for all analytics data, typically that of your headquarters.
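Normalizing timestamps before they are stored might look like the sketch below. It assumes UTC as the single analytics timezone purely for illustration; swap in your headquarters' timezone if that is your convention. The `to_analytics_time` helper is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def to_analytics_time(ts: datetime) -> datetime:
    """Convert an aware timestamp to the analytics timezone and drop tzinfo.

    Assumption: UTC is the chosen analytics timezone.
    """
    if ts.tzinfo is None:
        raise ValueError("expected a timezone-aware timestamp")
    return ts.astimezone(timezone.utc).replace(tzinfo=None)


# An event captured on a device running at UTC-8.
device_time = datetime(2024, 3, 5, 23, 30, tzinfo=timezone(timedelta(hours=-8)))
stored_time = to_analytics_time(device_time)
```

Note that the stored value falls on the next calendar day in UTC, which is why date parts should be derived after the conversion, not before.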
Define KPIs
Provide a list of top metrics for product analytics, such as:
- Engagement: DAU, MAU
- Retention: D2, W2 Retention
- Activity: Likes / DAU
- Monetization: ARPU
- Funnel: Registration, Purchase
- Performance: Download Failure Rate
Use DAU or MAU as denominators for rate calculations.
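As a worked example of a rate metric with DAU as the denominator, the sketch below computes DAU and Likes / DAU from a handful of made-up events. The tuple layout and function names are assumptions; in practice these metrics would be computed by queries against your warehouse.

```python
from collections import defaultdict

# Made-up sample rows: (event_date, user_id, event_name).
events = [
    ("2024-01-01", "u1", "Login"),
    ("2024-01-01", "u1", "Like"),
    ("2024-01-01", "u2", "Like"),
    ("2024-01-01", "u2", "Like"),
    ("2024-01-02", "u1", "Login"),
]


def daily_active_users(rows):
    """DAU: number of distinct users with at least one event per day."""
    users_by_day = defaultdict(set)
    for day, user_id, _ in rows:
        users_by_day[day].add(user_id)
    return {day: len(users) for day, users in users_by_day.items()}


def likes_per_dau(rows):
    """Activity rate: Like events per day divided by that day's DAU."""
    dau = daily_active_users(rows)
    likes = defaultdict(int)
    for day, _, name in rows:
        if name == "Like":
            likes[day] += 1
    return {day: likes[day] / count for day, count in dau.items()}


dau = daily_active_users(events)
rate = likes_per_dau(events)
```

Dividing by DAU normalizes the Like count for audience size, so days with different traffic remain comparable.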
Dimensions
Define dimensions needed for analytics, specifying event properties or dimension tables for joins.
Cohorts
Define user cohorts for breakdowns in reports, predictions, and integrations.
Security
Data Security
Kubit ensures data security by not storing or altering your data. Analytics are performed on aggregated data, and all communication is encrypted. Kubit's infrastructure enforces strict security measures.
User Management
Kubit uses Auth0 for user security. Enable Single Sign-On with enterprise identity services or provide user emails for direct provisioning. MFA can be enabled for added security.