Hard to Kill: Why Auto-Increment Primary Keys Can Make Data Sync Die Harder

by Joche Ojeda | Jan 22, 2025 | ADO, ADO.NET, C#, Data Synchronization, EfCore, XPO, XPO Database Replication

Working with the SyncFramework, I’ve noticed a recurring pattern when discussing schema design with customers. One crucial question that often surprises them is about their choice of primary keys: “Are you using auto-incremental integers or unique identifiers (like GUIDs)?”

Approximately 90% of users rely on auto-incremental integer primary keys. While this seems like a straightforward choice, it can create significant challenges for data synchronization. Let’s dive deep into how different database engines handle auto-increment values and why this matters for synchronization scenarios.

Database Implementation Deep Dive

SQL Server

SQL Server uses the IDENTITY property, storing current values in system tables (sys.identity_columns) and caching them in memory for performance. During restarts, it reads the last used value from these system tables. The values are managed as 8-byte numbers internally, with new ranges allocated when the cache is exhausted.

MySQL

MySQL’s InnoDB engine maintains auto-increment counters in memory and persists them to the system tablespace or table’s .frm file. After a restart, it scans the table to find the maximum used value. Each table has its own counter stored in the metadata.

PostgreSQL

PostgreSQL takes a different approach, using separate sequence objects stored in the pg_class catalog. These sequences maintain their own relation files containing crucial metadata like last value, increment, and min/max values. The sequence data is periodically checkpointed to disk for durability.

Oracle

Oracle traditionally uses sequences and triggers, with modern versions (12c+) supporting identity columns. The sequence information is stored in the SEQ$ system table, tracking the last number used, cache size, and increment values.

The Synchronization Challenge

This diversity in implementation creates several challenges for data synchronization:

Unpredictable Sequence Generation: Even within the same database engine, gaps can occur due to rolled-back transactions or server restarts.
Infrastructure Dependencies: The mechanisms for generating next values are deeply embedded within each database engine and aren’t easily accessible to frameworks like Entity Framework or XPO.
Cross-Database Complexity: When synchronizing across different database instances, coordinating auto-increment values becomes even more complex.

The GUID Alternative

Using GUIDs (Globally Unique Identifiers) as primary keys offers a solution to these synchronization challenges. While GUIDs come with their own set of considerations, they provide guaranteed uniqueness across distributed systems without requiring centralized coordination.

Traditional GUID Concerns

Index fragmentation
Storage size
Performance impact

Modern Solutions

These concerns have been addressed through:

Sequential GUID generation techniques
Improved indexing in modern databases
Optimizations in .NET 9

Recommendations

When designing systems that require data synchronization:

Consider using GUIDs instead of auto-increment integers for primary keys
Evaluate sequential GUID generation for better performance
Understand that auto-increment values, while simple, can complicate synchronization scenarios
Plan for the infrastructure needed to maintain consistent primary key generation across your distributed system

Conclusion

The choice of primary key strategy significantly impacts your system’s ability to handle data synchronization effectively. While auto-increment integers might seem simpler at first, understanding their implementation details across different databases reveals why GUIDs often provide a more robust solution for distributed systems.

Remember: Data synchronization is not a trivial problem, and your primary key strategy plays a crucial role in its success. Take the time to evaluate your requirements and choose the appropriate approach for your specific use case.

Till next time, happy delta encoding.

SyncFramework for XPO: Updated for .NET 8 & 9 and DevExpress 24.2.3!

by Joche Ojeda | Jan 22, 2025 | ADO.NET, C#, Data Synchronization, Database, DevExpress, XPO, XPO Database Replication

SyncFramework for XPO is a specialized implementation of our delta encoding synchronization library, designed specifically for DevExpress XPO users. It enables efficient data synchronization by tracking and transmitting only the changes between data versions, optimizing both bandwidth usage and processing time.

What’s New

Base target framework updated to .NET 8.0
Added compatibility with .NET 9.0
Updated DevExpress XPO dependencies to 24.2.3
Continued support for delta encoding synchronization
Various performance improvements and bug fixes

Framework Compatibility

Primary Target: .NET 8.0
Additional Support: .NET 9.0

Our XPO implementation continues to serve the DevExpress community.

Key Features

Seamless integration with DevExpress XPO
Efficient delta-based synchronization
Support for multiple database providers
Cross-platform compatibility
Easy integration with existing XPO and XAF applications

As always, if you own a license, you can compile the source code yourself from our GitHub repository. The framework maintains its commitment to providing reliable data synchronization for XPO applications.

Happy Delta Encoding! ?

SyncFramework Update: Now Supporting .NET 9 and EfCore 9!

by Joche Ojeda | Jan 21, 2025 | ADO.NET, C#, Data Synchronization, EfCore

SyncFramework Update: Now Supporting .NET 9!

SyncFramework is a C# library that simplifies data synchronization using delta encoding technology. Instead of transferring entire datasets, it efficiently synchronizes by tracking and transmitting only the changes between data versions, significantly reducing bandwidth and processing overhead.

What’s New

All packages now target .NET 9
BIT.Data.Sync packages updated to support the latest framework
Entity Framework Core packages upgraded to EF Core 9
Various minor fixes and improvements

Available Implementations

SyncFramework for XPO: For DevExpress XPO users
SyncFramework for Entity Framework Core: For EF Core users

Package Statistics

Our packages have been serving the community well, with steady adoption:

BIT.Data.Sync: 2,142 downloads
BIT.Data.Sync.AspNetCore: 1,064 downloads
BIT.Data.Sync.AspNetCore.Xpo: 521 downloads
BIT.Data.Sync.EfCore: 1,691 downloads
BIT.Data.Sync.EfCore.Npgsql: 1,120 downloads
BIT.Data.Sync.EfCore.Pomelo.MySql: 1,172 downloads
BIT.Data.Sync.EfCore.Sqlite: 887 downloads
BIT.Data.Sync.EfCore.SqlServer: 982 downloads

Resources

NuGet Packages
Source Code

As always, you can compile the source code yourself from our GitHub repository. The framework continues to provide reliable data synchronization across different platforms and databases.

Happy Delta Encoding! ?

ADOMD.NET: Beyond Rows and Columns – The Multidimensional Evolution of ADO.NET

by Joche Ojeda | Jan 20, 2025 | ADO, ADO.NET, Database, dotnet

When I first encountered the challenge of migrating hundreds of Visual Basic 6 reports to .NET, I never imagined it would lead me down a path of discovering specialized data analytics tools. Today, I want to share my experience with ADOMD.NET and how it could have transformed our reporting challenges, even though we couldn’t implement it due to our database constraints.

The Challenge: The Sales Gap Report

The story begins with a seemingly simple report called “Sales Gap.” Its purpose was critical: identify periods when regular customers stopped purchasing specific items. For instance, if a customer typically bought 10 units monthly from January to May, then suddenly stopped in June and July, sales representatives needed to understand why.

This report required complex queries across multiple transactional tables:

Invoicing
Sales
Returns
Debits
Credits

Initially, the report took about a minute to run. As our data grew, so did the execution time—eventually reaching an unbearable 15 minutes. We were stuck with a requirement to use real-time transactional data, making traditional optimization techniques like data warehousing off-limits.

Enter ADOMD.NET: A Specialized Solution

ADOMD.NET (ActiveX Data Objects Multidimensional .NET) emerged as a potential solution. Here’s why it caught my attention:

Key Features:

Multidimensional Analysis

Unlike traditional SQL queries, ADOMD.NET uses MDX (Multidimensional Expressions), specifically designed for analytical queries. Here’s a basic example:

string mdxQuery = @"
    SELECT 
        {[Measures].[Sales Amount]} ON COLUMNS,
        {[Date].[Calendar Year].MEMBERS} ON ROWS
    FROM [Sales Cube]
    WHERE [Product].[Category].[Electronics]";

Performance Optimization

ADOMD.NET is built for analytical workloads, offering better performance for complex calculations and aggregations. It achieves this through:
- Specialized data structures for multidimensional analysis
- Efficient handling of hierarchical data
- Built-in support for complex calculations

Advanced Analytics Capabilities

The tool supports sophisticated analysis patterns like:

string mdxQuery = @"
    WITH MEMBER [Measures].[GrowthVsPreviousYear] AS
        ([Measures].[Sales Amount] - 
        ([Measures].[Sales Amount], [Date].[Calendar Year].PREVMEMBER)
        )/([Measures].[Sales Amount], [Date].[Calendar Year].PREVMEMBER)
    SELECT 
        {[Measures].[Sales Amount], [Measures].[GrowthVsPreviousYear]} 
    ON COLUMNS...";

Lessons Learned

While we couldn’t implement ADOMD.NET due to our use of Pervasive Database instead of SQL Server, the investigation taught me valuable lessons about report optimization:

The importance of choosing the right tools for analytical workloads
The limitations of running complex analytics on transactional databases
The value of specialized query languages for different types of data analysis

Modern Applications

Today, ADOMD.NET continues to be relevant for organizations using:

SQL Server Analysis Services (SSAS)
Azure Analysis Services
Power BI Premium datasets

If I were facing the same challenge today with SQL Server, ADOMD.NET would be my go-to solution for:

Complex sales analysis
Customer behavior tracking
Performance-intensive analytical reports

Conclusion

While our specific situation with Pervasive Database prevented us from using ADOMD.NET, it remains a powerful tool for organizations using Microsoft’s analytics stack. The experience taught me that sometimes the solution isn’t about optimizing existing queries, but about choosing the right specialized tools for analytical workloads.

Remember: Just because you can run analytics on your transactional database doesn’t mean you should. Tools like ADOMD.NET exist for a reason, and understanding when to use them can save countless hours of optimization work and provide better results for your users.

The AnyCPU Illusion: Native Dependencies in .NET Applications

by Joche Ojeda | Jan 12, 2025 | ADO.NET, C#, CPU, dotnet, ORM, XAF, XPO

Introduction

In the .NET ecosystem, “AnyCPU” is often considered a silver bullet for cross-platform deployment. However, this assumption can lead to significant problems when your application depends on native assemblies. In this post, I want to share a personal story that highlights how I discovered these limitations and how native dependencies affect the true portability of AnyCPU applications, especially for database access through ADO.NET and popular ORMs.

My Journey to Understanding AnyCPU’s Limitations

Every year, around Thanksgiving or Christmas, I visit my friend, brother, and business partner Javier. Two years ago, during one of these visits, I made a decision that would lead me to a pivotal realization about AnyCPU architecture.

At the time, I was tired of traveling with my bulky MSI GE72 Apache Pro-24 gaming laptop. According to MSI’s official specifications, it weighed 5.95 pounds—but that number didn’t include the hefty charger, which brought the total to around 12 pounds. Later, I upgraded to an MSI GF63 Thin, which was lighter at 4.10 pounds—but with the charger, it was still around 7.5 pounds. Lugging these laptops through airports felt like a workout.

Determined to travel lighter, I purchased a MacBook Air with the M2 chip. At just 2.7 pounds, including the charger, the MacBook Air felt like a breath of fresh air. The Apple Silicon chip was incredibly fast, and I immediately fell in love with the machine.

Having used a MacBook Pro with Bootcamp and Windows 7 years ago, I thought I could recreate that experience by running a Windows virtual machine on my MacBook Air to check projects and do some light development while traveling.

The Virtualization Experiment

As someone who loves virtualization, I eagerly set up a Windows virtual machine on my MacBook Air. I grabbed my trusty Windows x64 ISO, set up the virtual machine, and attempted to boot it—but it failed. I quickly realized the issue was related to CPU architecture. My x64 ISO wasn’t compatible with the ARM-based M2 chip.

Undeterred, I downloaded a Windows 11 ISO for ARM architecture and created the VM. Success! Windows was up and running, and I installed Visual Studio along with my essential development tools, including DevExpress XPO (my favorite ORM).

The Demo Disaster

The real test came during a trip to Dubai, where I was scheduled to give a live demo showcasing how quickly you can develop Line-of-Business (LOB) apps with XAF. Everything started smoothly until I tried to connect my XAF app to the database. Despite my best efforts, the connection failed.

In the middle of the demo, I switched to an in-memory data provider to salvage the presentation. After the demo, I dug into the issue and realized the root cause was related to the CPU architecture. The native database drivers I was using weren’t compatible with the ARM architecture.

A Familiar Problem

This situation reminded me of the transition from x86 to x64 years ago. Back then, I encountered similar issues where native drivers wouldn’t load unless they matched the process architecture.

The Native Dependency Challenge

Platform-Specific Loading Requirements

Native DLLs must exactly match the CPU architecture of your application:

If your app runs as x86, it can only load x86 native DLLs.
If running as x64, it requires x64 native DLLs.
ARM requires ARM-specific binaries.
ARM64 requires ARM64-specific binaries.

There is no flexibility—attempting to load a DLL compiled for a different architecture results in an immediate failure.

How Native Libraries are Loaded

When your application loads a native DLL, the operating system follows a specific search pattern:

The application’s directory
System directories (System32/SysWOW64)
Directories listed in the PATH environment variable

Crucially, these native libraries must match the exact architecture of the running process.

// This seemingly simple code
[DllImport("native.dll")]
static extern void NativeMethod();

// Actually requires:
// - native.dll compiled for x86 when running as 32-bit
// - native.dll compiled for x64 when running as 64-bit
// - native.dll compiled for ARM64 when running on ARM64

The SQL Server Example

Let’s look at SQL Server connectivity, a common scenario where the AnyCPU illusion breaks down:

// Traditional ADO.NET connection
using (var connection = new SqlConnection(connectionString))
{
    // This requires SQL Native Client
    // Which must match the process architecture
    await connection.OpenAsync();
}

Even though your application is compiled as AnyCPU, the SQL Native Client must match the process architecture. This becomes particularly problematic on newer architectures like ARM64, where native drivers may not be available.

Impact on ORMs

Entity Framework Core

Entity Framework Core, despite its modern design, still relies on database providers that may have native dependencies:

public class MyDbContext : DbContext
{
    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
    {
        // This configuration depends on:
        // 1. SQL Native Client
        // 2. Microsoft.Data.SqlClient native components
        optionsBuilder.UseSqlServer(connectionString);
    }
}

DevExpress XPO

DevExpress XPO faces similar challenges:

// XPO configuration
string connectionString = MSSqlConnectionProvider.GetConnectionString("server", "database");
XpoDefault.DataLayer = XpoDefault.GetDataLayer(connectionString, AutoCreateOption.DatabaseAndSchema);

// The MSSqlConnectionProvider relies on the same native SQL Server components

Solutions and Best Practices

1. Architecture-Specific Deployment

Instead of relying on AnyCPU, consider creating architecture-specific builds:

<PropertyGroup>
    <Platforms>x86;x64;arm64</Platforms>
    <RuntimeIdentifiers>win-x86;win-x64;win-arm64</RuntimeIdentifiers>
</PropertyGroup>

2. Runtime Provider Selection

Implement smart provider selection based on the current architecture:

public static class DatabaseProviderFactory
{
    public static IDbConnection GetProvider()
    {
        return RuntimeInformation.ProcessArchitecture switch
        {
            Architecture.X86 => new SqlConnection(), // x86 native provider
            Architecture.X64 => new SqlConnection(), // x64 native provider
            Architecture.Arm64 => new Microsoft.Data.SqlClient.SqlConnection(), // ARM64 support
            _ => throw new PlatformNotSupportedException()
        };
    }
}

3. Managed Fallbacks

Implement fallback strategies when native providers aren’t available:

public class DatabaseConnection
{
    public async Task<IDbConnection> CreateConnectionAsync()
    {
        try
        {
            var connection = new SqlConnection(_connectionString);
            await connection.OpenAsync();
            return connection;
        }
        catch (DllNotFoundException)
        {
            var managedConnection = new Microsoft.Data.SqlClient.SqlConnection(_connectionString);
            await managedConnection.OpenAsync();
            return managedConnection;
        }
    }
}

4. Deployment Considerations

Include all necessary native dependencies for each target architecture.
Use architecture-specific directories in your deployment.
Consider self-contained deployment to include the correct runtime.

Real-World Implications

This experience taught me that while AnyCPU provides excellent flexibility for managed code, it has limitations when dealing with native dependencies. These limitations become more apparent in scenarios like cloud deployments, ARM64 devices, and live demos.

Conclusion

The transition to ARM architecture is accelerating, and understanding the nuances of AnyCPU and native dependencies is more important than ever. By planning for architecture-specific deployments and implementing fallback strategies, you can build more resilient applications that can thrive in a multi-architecture world.

User-Defined Functions in SQLite: Enhancing SQL with Custom C# Procedures

by Joche Ojeda | Jan 15, 2024 | ADO.NET, C#, Database, Sqlite

SQLite, known for its simplicity and lightweight architecture, offers unique opportunities for developers to integrate custom functions directly into their applications. Unlike most databases that require learning an SQL dialect for procedural programming, SQLite operates in-process with your application. This design choice allows developers to define functions using their application’s programming language, enhancing the database’s flexibility and functionality.

Scalar Functions

Scalar functions in SQLite are designed to return a single value for each row in a query. Developers can define new scalar functions or override built-in ones using the CreateFunction method. This method supports various data types for parameters and return types, ensuring versatility in function creation. Developers can specify the state argument to pass a consistent value across all function invocations, avoiding the need for closures. Additionally, marking a function as isDeterministic optimizes query compilation by SQLite if the function’s output is predictable based on its input.

Example: Adding a Scalar Function


connection.CreateFunction(
    "volume",
    (double radius, double height) => Math.PI * Math.Pow(radius, 2) * height);

var command = connection.CreateCommand();
command.CommandText = @"
    SELECT name,
           volume(radius, height) AS volume
    FROM cylinder
    ORDER BY volume DESC
";

Operators

SQLite implements several operators as scalar functions. Defining these functions in your application overrides the default behavior of these operators. For example, functions like glob, like, and regexp can be custom-defined to change the behavior of their corresponding operators in SQL queries.

Example: Defining the regexp Function


connection.CreateFunction(
    "regexp",
    (string pattern, string input) => Regex.IsMatch(input, pattern));

var command = connection.CreateCommand();
command.CommandText = @"
    SELECT count()
    FROM user
    WHERE bio REGEXP '\w\. {2,}\w'
";
var count = command.ExecuteScalar();

Aggregate Functions

Aggregate functions return a consolidated value from multiple rows. Using CreateAggregate, developers can define and override these functions. The seed argument sets the initial context state, and the func argument is executed for each row. The resultSelector parameter, if specified, calculates the final result from the context after processing all rows.

Example: Creating an Aggregate Function for Standard Deviation


connection.CreateAggregate(
    "stdev",
    (Count: 0, Sum: 0.0, SumOfSquares: 0.0),
    ((int Count, double Sum, double SumOfSquares) context, double value) => {
        context.Count++;
        context.Sum += value;
        context.SumOfSquares += value * value;
        return context;
    },
    context => {
        var variance = context.SumOfSquares - context.Sum * context.Sum / context.Count;
        return Math.Sqrt(variance / context.Count);
    });

var command = connection.CreateCommand
();
command.CommandText = @"
SELECT stdev(gpa)
FROM student
";
var stdDev = command.ExecuteScalar();

Errors

When a user-defined function throws an exception in SQLite, the message is returned to the database engine, which then raises an error. Developers can customize the SQLite error code by throwing a SqliteException with a specific SqliteErrorCode.

Debugging

SQLite directly invokes the implementation of user-defined functions, allowing developers to insert breakpoints and leverage the full .NET debugging experience. This integration facilitates debugging and enhances the development of robust, error-free custom functions.

This article illustrates the power and flexibility of SQLite’s approach to user-defined functions, demonstrating how developers can extend the functionality of SQL with the programming language of their application, thereby streamlining the development process and enhancing database interaction.

Github Repo

« Older Entries

Hard to Kill: Why Auto-Increment Primary Keys Can Make Data Sync Die Harder

Database Implementation Deep Dive

SQL Server

MySQL

PostgreSQL

Oracle

The Synchronization Challenge

The GUID Alternative

Traditional GUID Concerns

Modern Solutions

Recommendations

Conclusion

SyncFramework for XPO: Updated for .NET 8 & 9 and DevExpress 24.2.3!

What’s New

Framework Compatibility

Key Features

SyncFramework Update: Now Supporting .NET 9 and EfCore 9!

SyncFramework Update: Now Supporting .NET 9!

What’s New

Available Implementations

Package Statistics

Resources

ADOMD.NET: Beyond Rows and Columns – The Multidimensional Evolution of ADO.NET

The Challenge: The Sales Gap Report

Enter ADOMD.NET: A Specialized Solution

Key Features:

Multidimensional Analysis

Performance Optimization

Advanced Analytics Capabilities

Lessons Learned

Modern Applications

Conclusion

The AnyCPU Illusion: Native Dependencies in .NET Applications

Introduction

My Journey to Understanding AnyCPU’s Limitations

The Virtualization Experiment

The Demo Disaster

A Familiar Problem

The Native Dependency Challenge

Platform-Specific Loading Requirements

How Native Libraries are Loaded

The SQL Server Example

Impact on ORMs

Entity Framework Core

DevExpress XPO

Solutions and Best Practices

1. Architecture-Specific Deployment

2. Runtime Provider Selection

3. Managed Fallbacks

4. Deployment Considerations

Real-World Implications

Conclusion

User-Defined Functions in SQLite: Enhancing SQL with Custom C# Procedures

Scalar Functions

Example: Adding a Scalar Function

Operators

Example: Defining the regexp Function

Aggregate Functions

Example: Creating an Aggregate Function for Standard Deviation

Errors

Debugging

Search

Recent Posts

Categories

Archives