by Joche Ojeda | Jan 22, 2025 | ADO, ADO.NET, C#, Data Synchronization, EfCore, XPO, XPO Database Replication
Working with the SyncFramework, I’ve noticed a recurring pattern when discussing schema design with customers. One crucial question that often surprises them is about their choice of primary keys: “Are you using auto-incremental integers or unique identifiers (like GUIDs)?”
Approximately 90% of users rely on auto-incremental integer primary keys. While this seems like a straightforward choice, it can create significant challenges for data synchronization. Let’s dive deep into how different database engines handle auto-increment values and why this matters for synchronization scenarios.
Database Implementation Deep Dive
SQL Server
SQL Server uses the IDENTITY property. Metadata such as the seed, increment, and last value is exposed through the sys.identity_columns catalog view, and the engine caches a range of upcoming values in memory for performance, allocating a new range when the cache is exhausted. After a restart, it resumes from the last value recorded on disk, so an unexpected shutdown can discard the unused part of the cached range and leave a gap. The values are managed as 8-byte numbers internally.
MySQL
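MySQL’s InnoDB engine keeps each table’s auto-increment counter in memory. Before MySQL 8.0 the counter was not persisted, so after a restart InnoDB re-initialized it from the maximum value found in the column. From MySQL 8.0 onward, the counter is written to the redo log whenever it changes and saved to the data dictionary at checkpoints, so it survives restarts. Each table has its own counter.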
PostgreSQL
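PostgreSQL takes a different approach, using separate sequence objects that are registered in the pg_class catalog. Each sequence has its own relation file holding its current state, such as the last value, while parameters like the increment and min/max values live in the pg_sequence catalog (PostgreSQL 10 and later). Sequence changes are written to the write-ahead log for durability, and because sequences are not transactional, a rolled-back insert still consumes a value.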
Oracle
Oracle traditionally uses sequences and triggers, with modern versions (12c+) supporting identity columns. The sequence information is stored in the SEQ$ system table, tracking the last number used, cache size, and increment values.
The Synchronization Challenge
This diversity in implementation creates several challenges for data synchronization:
- Unpredictable Sequence Generation: Even within the same database engine, gaps can occur due to rolled-back transactions or server restarts.
- Infrastructure Dependencies: The mechanisms for generating next values are deeply embedded within each database engine and aren’t easily accessible to frameworks like Entity Framework or XPO.
- Cross-Database Complexity: When synchronizing across different database instances, coordinating auto-increment values becomes even more complex.
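To make the problem concrete, here is a minimal sketch that simulates two nodes inserting rows while disconnected from each other. OfflineNode and KeyCollisionDemo are hypothetical names invented for this illustration; they are not part of SyncFramework, and the counter only imitates what a database engine does.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a database node that is offline from its peers.
public sealed class OfflineNode
{
    // Each node starts its own counter, like a fresh auto-increment column.
    private int _nextIdentity = 1;

    public List<int> IntKeys { get; } = new();
    public List<Guid> GuidKeys { get; } = new();

    // Simulates inserting one row and assigning both kinds of key.
    public void InsertRow()
    {
        IntKeys.Add(_nextIdentity++);
        GuidKeys.Add(Guid.NewGuid());
    }
}

public static class KeyCollisionDemo
{
    // Counts keys that exist on both nodes and would clash when merged.
    public static int CountCollisions<T>(IEnumerable<T> first, IEnumerable<T> second) =>
        first.Intersect(second).Count();

    public static (int IntCollisions, int GuidCollisions) Run(int rowsPerNode)
    {
        var nodeA = new OfflineNode();
        var nodeB = new OfflineNode();
        for (var i = 0; i < rowsPerNode; i++)
        {
            nodeA.InsertRow();
            nodeB.InsertRow();
        }

        return (CountCollisions(nodeA.IntKeys, nodeB.IntKeys),
                CountCollisions(nodeA.GuidKeys, nodeB.GuidKeys));
    }
}
```

Calling KeyCollisionDemo.Run(100) reports that every integer key generated on one node also exists on the other, while the GUID keys do not overlap. A synchronization layer would have to remap or partition the integer keys before it could merge the two sets of rows.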
The GUID Alternative
Using GUIDs (Globally Unique Identifiers) as primary keys offers a solution to these synchronization challenges. While GUIDs come with their own set of considerations, they provide uniqueness across distributed systems without requiring centralized coordination, because the chance of two independently generated values colliding is negligible.
Traditional GUID Concerns
- Index fragmentation
- Storage size
- Performance impact
Modern Solutions
These concerns have been addressed through:
- Sequential GUID generation techniques
- Improved indexing in modern databases
- Optimizations in .NET 9, such as Guid.CreateVersion7 for generating time-ordered (UUID version 7) values
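As one illustration of sequential GUID generation, the sketch below uses Guid.CreateVersion7, which was added in .NET 9 and places a Unix-millisecond timestamp in the leading bits of the value. SequentialGuidDemo and its method names are hypothetical helpers for this example only, and the code will not compile on earlier .NET versions.

```csharp
using System;

public static class SequentialGuidDemo
{
    // Requires .NET 9 or later. The leading 48 bits hold a Unix-millisecond timestamp.
    public static Guid NewKey() => Guid.CreateVersion7();

    // Overload with an explicit timestamp, handy for demonstrating the ordering.
    public static Guid NewKeyAt(DateTimeOffset timestamp) => Guid.CreateVersion7(timestamp);

    // True when `earlier` sorts before `later` in canonical text form.
    public static bool SortsBefore(Guid earlier, Guid later) =>
        string.CompareOrdinal(earlier.ToString("D"), later.ToString("D")) < 0;
}
```

Keys created at later timestamps sort after earlier ones in their text form, which is what helps keep index inserts close together. Database engines do not all compare GUID columns the same way (SQL Server, for example, orders uniqueidentifier values by a different byte sequence), so it is worth measuring index behavior on your target engine before relying on this.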
Recommendations
When designing systems that require data synchronization:
- Consider using GUIDs instead of auto-increment integers for primary keys
- Evaluate sequential GUID generation for better performance
- Understand that auto-increment values, while simple, can complicate synchronization scenarios
- Plan for the infrastructure needed to maintain consistent primary key generation across your distributed system
Conclusion
The choice of primary key strategy significantly impacts your system’s ability to handle data synchronization effectively. While auto-increment integers might seem simpler at first, understanding their implementation details across different databases reveals why GUIDs often provide a more robust solution for distributed systems.
Remember: Data synchronization is not a trivial problem, and your primary key strategy plays a crucial role in its success. Take the time to evaluate your requirements and choose the appropriate approach for your specific use case.
Till next time, happy delta encoding.
by Joche Ojeda | Jan 22, 2025 | ADO.NET, C#, Data Synchronization, Database, DevExpress, XPO, XPO Database Replication
SyncFramework for XPO is a specialized implementation of our delta encoding synchronization library, designed specifically for DevExpress XPO users. It enables efficient data synchronization by tracking and transmitting only the changes between data versions, optimizing both bandwidth usage and processing time.
What’s New
- Base target framework updated to .NET 8.0
- Added compatibility with .NET 9.0
- Updated DevExpress XPO dependencies to 24.2.3
- Continued support for delta encoding synchronization
- Various performance improvements and bug fixes
Framework Compatibility
- Primary Target: .NET 8.0
- Additional Support: .NET 9.0
Our XPO implementation continues to serve the DevExpress community.
Key Features
- Seamless integration with DevExpress XPO
- Efficient delta-based synchronization
- Support for multiple database providers
- Cross-platform compatibility
- Easy integration with existing XPO and XAF applications
As always, if you own a license, you can compile the source code yourself from our GitHub repository. The framework maintains its commitment to providing reliable data synchronization for XPO applications.
Happy Delta Encoding! 🚀
by Joche Ojeda | Jan 21, 2025 | ADO.NET, C#, Data Synchronization, EfCore
SyncFramework Update: Now Supporting .NET 9!
SyncFramework is a C# library that simplifies data synchronization using delta encoding technology. Instead of transferring entire datasets, it efficiently synchronizes by tracking and transmitting only the changes between data versions, significantly reducing bandwidth and processing overhead.
What’s New
- All packages now target .NET 9
- BIT.Data.Sync packages updated to support the latest framework
- Entity Framework Core packages upgraded to EF Core 9
- Various minor fixes and improvements
Available Implementations
- SyncFramework for XPO: For DevExpress XPO users
- SyncFramework for Entity Framework Core: For EF Core users
Package Statistics
Our packages have been serving the community well, with steady adoption:
- BIT.Data.Sync: 2,142 downloads
- BIT.Data.Sync.AspNetCore: 1,064 downloads
- BIT.Data.Sync.AspNetCore.Xpo: 521 downloads
- BIT.Data.Sync.EfCore: 1,691 downloads
- BIT.Data.Sync.EfCore.Npgsql: 1,120 downloads
- BIT.Data.Sync.EfCore.Pomelo.MySql: 1,172 downloads
- BIT.Data.Sync.EfCore.Sqlite: 887 downloads
- BIT.Data.Sync.EfCore.SqlServer: 982 downloads
Resources
NuGet Packages
Source Code
As always, you can compile the source code yourself from our GitHub repository. The framework continues to provide reliable data synchronization across different platforms and databases.
Happy Delta Encoding! 🚀
by Joche Ojeda | Jan 20, 2025 | ADO, ADO.NET, Database, dotnet
When I first encountered the challenge of migrating hundreds of Visual Basic 6 reports to .NET, I never imagined it would lead me down a path of discovering specialized data analytics tools. Today, I want to share my experience with ADOMD.NET and how it could have transformed our reporting challenges, even though we couldn’t implement it due to our database constraints.
The Challenge: The Sales Gap Report
The story begins with a seemingly simple report called “Sales Gap.” Its purpose was critical: identify periods when regular customers stopped purchasing specific items. For instance, if a customer typically bought 10 units monthly from January to May, then suddenly stopped in June and July, sales representatives needed to understand why.
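The core rule of the report can be sketched in a few lines of C#. This is not the original VB6 logic or its SQL; MonthlySale, SalesGap, SalesGapFinder, and the minStreak threshold are assumptions made for illustration, with input already aggregated to one row per customer, item, and month.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape: one row per customer, item, and month with the units sold.
public record MonthlySale(string Customer, string Item, int Year, int Month, int Units);

public record SalesGap(string Customer, string Item, int Year, int Month);

public static class SalesGapFinder
{
    // Reports every month without sales that comes after a run of at least
    // `minStreak` consecutive months with sales for the same customer and item.
    public static List<SalesGap> Find(
        IEnumerable<MonthlySale> sales, int year, int lastMonth, int minStreak = 3)
    {
        var gaps = new List<SalesGap>();
        var groups = sales.Where(s => s.Year == year && s.Units > 0)
                          .GroupBy(s => (s.Customer, s.Item));

        foreach (var group in groups)
        {
            var activeMonths = group.Select(s => s.Month).ToHashSet();
            var streak = 0;
            var isRegular = false;

            for (var month = 1; month <= lastMonth; month++)
            {
                if (activeMonths.Contains(month))
                {
                    streak++;
                    isRegular |= streak >= minStreak;
                }
                else
                {
                    streak = 0;
                    if (isRegular)
                        gaps.Add(new SalesGap(group.Key.Customer, group.Key.Item, year, month));
                }
            }
        }

        return gaps;
    }
}
```

For a customer who bought 10 units each month from January to May, calling Find with lastMonth set to 7 flags June and July. The rule itself is simple; the expensive part in practice was producing the monthly totals from several large transactional tables.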
This report required complex queries across multiple transactional tables:
- Invoicing
- Sales
- Returns
- Debits
- Credits
Initially, the report took about a minute to run. As our data grew, so did the execution time—eventually reaching an unbearable 15 minutes. We were stuck with a requirement to use real-time transactional data, making traditional optimization techniques like data warehousing off-limits.
Enter ADOMD.NET: A Specialized Solution
ADOMD.NET (ActiveX Data Objects Multidimensional .NET) emerged as a potential solution. Here’s why it caught my attention:
Key Features:
- Multidimensional Analysis
Unlike traditional SQL queries, ADOMD.NET uses MDX (Multidimensional Expressions), specifically designed for analytical queries. Here’s a basic example:
string mdxQuery = @"
    SELECT
        {[Measures].[Sales Amount]} ON COLUMNS,
        {[Date].[Calendar Year].MEMBERS} ON ROWS
    FROM [Sales Cube]
    WHERE [Product].[Category].[Electronics]";
- Performance Optimization
ADOMD.NET is built for analytical workloads, offering better performance for complex calculations and aggregations. It achieves this through:
- Specialized data structures for multidimensional analysis
- Efficient handling of hierarchical data
- Built-in support for complex calculations
- Advanced Analytics Capabilities
The tool supports sophisticated analysis patterns like:
string mdxQuery = @"
    WITH MEMBER [Measures].[GrowthVsPreviousYear] AS
        ([Measures].[Sales Amount] -
            ([Measures].[Sales Amount], [Date].[Calendar Year].PREVMEMBER)
        )/([Measures].[Sales Amount], [Date].[Calendar Year].PREVMEMBER)
    SELECT
        {[Measures].[Sales Amount], [Measures].[GrowthVsPreviousYear]}
        ON COLUMNS...";
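For readers less familiar with MDX, the calculated member above computes year-over-year growth as (current - previous) / previous. The same arithmetic in plain C# looks roughly like the sketch below. GrowthCalculator is a hypothetical helper, not part of ADOMD.NET, and returning null when there is no usable previous year is my own choice for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GrowthCalculator
{
    // Mirrors the calculated member: (current - previous) / previous.
    // Returns null for the first year and when the previous amount is zero.
    public static Dictionary<int, double?> GrowthVsPreviousYear(
        IReadOnlyDictionary<int, double> salesByYear)
    {
        var result = new Dictionary<int, double?>();
        double? previous = null;

        foreach (var pair in salesByYear.OrderBy(p => p.Key))
        {
            if (previous.HasValue && previous.Value != 0)
                result[pair.Key] = (pair.Value - previous.Value) / previous.Value;
            else
                result[pair.Key] = null;

            previous = pair.Value;
        }

        return result;
    }
}
```

With sales of 100 in 2022 and 150 in 2023, the growth for 2023 is 0.5 and the value for 2022 is null. An analytical engine evaluates this kind of expression against pre-aggregated data, which is where the performance difference over row-by-row SQL comes from.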
Lessons Learned
While we couldn’t implement ADOMD.NET due to our use of Pervasive Database instead of SQL Server, the investigation taught me valuable lessons about report optimization:
- The importance of choosing the right tools for analytical workloads
- The limitations of running complex analytics on transactional databases
- The value of specialized query languages for different types of data analysis
Modern Applications
Today, ADOMD.NET continues to be relevant for organizations using:
- SQL Server Analysis Services (SSAS)
- Azure Analysis Services
- Power BI Premium datasets
If I were facing the same challenge today with SQL Server, ADOMD.NET would be my go-to solution for:
- Complex sales analysis
- Customer behavior tracking
- Performance-intensive analytical reports
Conclusion
While our specific situation with Pervasive Database prevented us from using ADOMD.NET, it remains a powerful tool for organizations using Microsoft’s analytics stack. The experience taught me that sometimes the solution isn’t about optimizing existing queries, but about choosing the right specialized tools for analytical workloads.
Remember: Just because you can run analytics on your transactional database doesn’t mean you should. Tools like ADOMD.NET exist for a reason, and understanding when to use them can save countless hours of optimization work and provide better results for your users.
by Joche Ojeda | Jan 12, 2025 | ADO.NET, C#, CPU, dotnet, ORM, XAF, XPO
Introduction
In the .NET ecosystem, “AnyCPU” is often considered a silver bullet for cross-platform deployment. However, this assumption can lead to significant problems when your application depends on native assemblies. In this post, I want to share a personal story that highlights how I discovered these limitations and how native dependencies affect the true portability of AnyCPU applications, especially for database access through ADO.NET and popular ORMs.
My Journey to Understanding AnyCPU’s Limitations
Every year, around Thanksgiving or Christmas, I visit my friend, brother, and business partner Javier. Two years ago, during one of these visits, I made a decision that would lead me to a pivotal realization about AnyCPU architecture.
At the time, I was tired of traveling with my bulky MSI GE72 Apache Pro-24 gaming laptop. According to MSI’s official specifications, it weighed 5.95 pounds—but that number didn’t include the hefty charger, which brought the total to around 12 pounds. Later, I upgraded to an MSI GF63 Thin, which was lighter at 4.10 pounds—but with the charger, it was still around 7.5 pounds. Lugging these laptops through airports felt like a workout.
Determined to travel lighter, I purchased a MacBook Air with the M2 chip. At just 2.7 pounds, including the charger, the MacBook Air felt like a breath of fresh air. The Apple Silicon chip was incredibly fast, and I immediately fell in love with the machine.
Having used a MacBook Pro with Boot Camp and Windows 7 years earlier, I thought I could recreate that experience by running a Windows virtual machine on my MacBook Air to check projects and do some light development while traveling.
The Virtualization Experiment
As someone who loves virtualization, I eagerly set up a Windows virtual machine on my MacBook Air. I grabbed my trusty Windows x64 ISO, set up the virtual machine, and attempted to boot it—but it failed. I quickly realized the issue was related to CPU architecture. My x64 ISO wasn’t compatible with the ARM-based M2 chip.
Undeterred, I downloaded a Windows 11 ISO for ARM architecture and created the VM. Success! Windows was up and running, and I installed Visual Studio along with my essential development tools, including DevExpress XPO (my favorite ORM).
The Demo Disaster
The real test came during a trip to Dubai, where I was scheduled to give a live demo showcasing how quickly you can develop Line-of-Business (LOB) apps with XAF. Everything started smoothly until I tried to connect my XAF app to the database. Despite my best efforts, the connection failed.
In the middle of the demo, I switched to an in-memory data provider to salvage the presentation. After the demo, I dug into the issue and realized the root cause was related to the CPU architecture. The native database drivers I was using weren’t compatible with the ARM architecture.
A Familiar Problem
This situation reminded me of the transition from x86 to x64 years ago. Back then, I encountered similar issues where native drivers wouldn’t load unless they matched the process architecture.
The Native Dependency Challenge
Platform-Specific Loading Requirements
Native DLLs must exactly match the CPU architecture of your application:
- If your app runs as x86, it can only load x86 native DLLs.
- If running as x64, it requires x64 native DLLs.
- ARM requires ARM-specific binaries.
- ARM64 requires ARM64-specific binaries.
There is no flexibility: attempting to load a DLL compiled for a different architecture fails immediately, typically with a BadImageFormatException in .NET.
How Native Libraries are Loaded
When your application loads a native DLL, the operating system follows a specific search pattern:
- The application’s directory
- System directories (System32/SysWOW64)
- Directories listed in the PATH environment variable
Crucially, these native libraries must match the exact architecture of the running process.
// This seemingly simple code
[DllImport("native.dll")]
static extern void NativeMethod();
// Actually requires:
// - native.dll compiled for x86 when running as 32-bit
// - native.dll compiled for x64 when running as 64-bit
// - native.dll compiled for ARM64 when running on ARM64
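When diagnosing this kind of failure, it helps to first confirm which architecture the AnyCPU process actually ended up with. The sketch below uses RuntimeInformation and Environment from the base class library; ArchitectureReport is a hypothetical helper name for this example.

```csharp
using System;
using System.Runtime.InteropServices;

public static class ArchitectureReport
{
    // True when the process architecture is not the one the OS reports,
    // for example a 32-bit process on 64-bit Windows.
    public static bool ProcessDiffersFromOs =>
        RuntimeInformation.ProcessArchitecture != RuntimeInformation.OSArchitecture;

    // Summarizes what an AnyCPU build became once the process started.
    public static string Describe() =>
        $"OS: {RuntimeInformation.OSArchitecture}, " +
        $"process: {RuntimeInformation.ProcessArchitecture}, " +
        $"64-bit process: {Environment.Is64BitProcess}";
}
```

Logging ArchitectureReport.Describe() at startup gives a quick way to tell whether a native DLL in the application folder can match the running process at all.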
The SQL Server Example
Let’s look at SQL Server connectivity, a common scenario where the AnyCPU illusion breaks down:
// Traditional ADO.NET connection
using (var connection = new SqlConnection(connectionString))
{
    // On Windows, SqlClient loads a native SNI (SQL Server Network Interface) library,
    // which must match the process architecture
    await connection.OpenAsync();
}
Even though your application is compiled as AnyCPU, the native networking component that SqlClient loads on Windows must match the process architecture. This becomes particularly problematic on newer architectures like ARM64, where native drivers may not be available.
Impact on ORMs
Entity Framework Core
Entity Framework Core, despite its modern design, still relies on database providers that may have native dependencies:
public class MyDbContext : DbContext
{
    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
    {
        // On Windows this configuration depends on:
        // 1. Microsoft.Data.SqlClient, the ADO.NET provider used by EF Core
        // 2. Its native SNI component, which must match the process architecture
        optionsBuilder.UseSqlServer(connectionString);
    }
}
DevExpress XPO
DevExpress XPO faces similar challenges:
// XPO configuration
string connectionString = MSSqlConnectionProvider.GetConnectionString("server", "database");
XpoDefault.DataLayer = XpoDefault.GetDataLayer(connectionString, AutoCreateOption.DatabaseAndSchema);
// The MSSqlConnectionProvider relies on the same native SQL Server components
Solutions and Best Practices
1. Architecture-Specific Deployment
Instead of relying on AnyCPU, consider creating architecture-specific builds:
<PropertyGroup>
<Platforms>x86;x64;arm64</Platforms>
<RuntimeIdentifiers>win-x86;win-x64;win-arm64</RuntimeIdentifiers>
</PropertyGroup>
2. Runtime Provider Selection
Implement smart provider selection based on the current architecture:
using System;
using System.Data;
using System.Runtime.InteropServices;

public static class DatabaseProviderFactory
{
    public static IDbConnection GetProvider()
    {
        // Target-typed switch (C# 9+): each arm converts to IDbConnection
        return RuntimeInformation.ProcessArchitecture switch
        {
            Architecture.X86 => new System.Data.SqlClient.SqlConnection(), // loads the x86 native component
            Architecture.X64 => new System.Data.SqlClient.SqlConnection(), // loads the x64 native component
            Architecture.Arm64 => new Microsoft.Data.SqlClient.SqlConnection(), // ARM64 support
            _ => throw new PlatformNotSupportedException()
        };
    }
}
3. Managed Fallbacks
Implement fallback strategies when native providers aren’t available:
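public class DatabaseConnection
{
    private readonly string _connectionString;

    public DatabaseConnection(string connectionString) => _connectionString = connectionString;

    public async Task<IDbConnection> CreateConnectionAsync()
    {
        var connection = new System.Data.SqlClient.SqlConnection(_connectionString);
        try
        {
            await connection.OpenAsync();
            return connection;
        }
        // A missing native library surfaces as DllNotFoundException,
        // one built for another architecture as BadImageFormatException
        catch (Exception ex) when (ex is DllNotFoundException or BadImageFormatException)
        {
            connection.Dispose(); // release the connection that failed to open
            var fallbackConnection = new Microsoft.Data.SqlClient.SqlConnection(_connectionString);
            await fallbackConnection.OpenAsync();
            return fallbackConnection;
        }
    }
}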
4. Deployment Considerations
- Include all necessary native dependencies for each target architecture.
- Use architecture-specific directories in your deployment.
- Consider self-contained deployment to include the correct runtime.
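One way to use architecture-specific directories is to register a resolver that points P/Invoke calls at the right folder. The sketch below uses NativeLibrary.SetDllImportResolver (available since .NET Core 3.0). NativeLibraryLocator is a hypothetical helper, and the runtimes/win-{arch}/native layout is an assumption borrowed from the NuGet convention, so adjust it to however your native files are actually deployed.

```csharp
using System;
using System.IO;
using System.Reflection;
using System.Runtime.InteropServices;

public static class NativeLibraryLocator
{
    // Maps the process architecture to a folder such as "runtimes/win-x64/native".
    // Windows-only naming is assumed here for brevity.
    public static string NativeFolderFor(Architecture architecture)
    {
        var suffix = architecture switch
        {
            Architecture.X86 => "x86",
            Architecture.X64 => "x64",
            Architecture.Arm => "arm",
            Architecture.Arm64 => "arm64",
            _ => throw new PlatformNotSupportedException($"Unsupported architecture: {architecture}")
        };
        return Path.Combine("runtimes", $"win-{suffix}", "native");
    }

    // Registers a resolver so [DllImport] calls in `assembly` try the
    // architecture-specific folder first. Returning IntPtr.Zero lets the
    // runtime fall back to its default search.
    public static void Register(Assembly assembly) =>
        NativeLibrary.SetDllImportResolver(assembly, (libraryName, _, _) =>
        {
            var candidate = Path.Combine(
                AppContext.BaseDirectory,
                NativeFolderFor(RuntimeInformation.ProcessArchitecture),
                libraryName);
            return NativeLibrary.TryLoad(candidate, out var handle) ? handle : IntPtr.Zero;
        });
}
```

A resolver can be set only once per assembly, so Register would typically be called a single time at startup, for example with typeof(Program).Assembly. This only affects your own P/Invoke declarations; third-party providers locate their native components through their own mechanisms.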
Real-World Implications
This experience taught me that while AnyCPU provides excellent flexibility for managed code, it has limitations when dealing with native dependencies. These limitations become more apparent in scenarios like cloud deployments, ARM64 devices, and live demos.
Conclusion
The transition to ARM architecture is accelerating, and understanding the nuances of AnyCPU and native dependencies is more important than ever. By planning for architecture-specific deployments and implementing fallback strategies, you can build more resilient applications that can thrive in a multi-architecture world.
by Joche Ojeda | Jan 15, 2024 | ADO.NET, C#, Database, Sqlite
SQLite, known for its simplicity and lightweight architecture, offers unique opportunities for developers to integrate custom functions directly into their applications. Unlike most databases that require learning an SQL dialect for procedural programming, SQLite operates in-process with your application. This design choice allows developers to define functions using their application’s programming language, enhancing the database’s flexibility and functionality.
Scalar Functions
Scalar functions in SQLite are designed to return a single value for each row in a query. Developers can define new scalar functions or override built-in ones using the CreateFunction method. This method supports various data types for parameters and return types, ensuring versatility in function creation. Developers can specify the state argument to pass a consistent value across all function invocations, avoiding the need for closures. Additionally, setting isDeterministic to true tells SQLite that the function always returns the same result for the same arguments, which lets it apply additional optimizations when compiling queries.
Example: Adding a Scalar Function
connection.CreateFunction(
    "volume",
    (double radius, double height) => Math.PI * Math.Pow(radius, 2) * height);

var command = connection.CreateCommand();
command.CommandText = @"
    SELECT name,
           volume(radius, height) AS volume
    FROM cylinder
    ORDER BY volume DESC
";
Operators
SQLite implements several operators as scalar functions. Defining these functions in your application overrides the default behavior of these operators. For example, functions like glob, like, and regexp can be custom-defined to change the behavior of their corresponding operators in SQL queries.
Example: Defining the regexp Function
connection.CreateFunction(
    "regexp",
    (string pattern, string input) => Regex.IsMatch(input, pattern));

var command = connection.CreateCommand();
command.CommandText = @"
    SELECT count()
    FROM user
    WHERE bio REGEXP '\w\. {2,}\w'
";
var count = command.ExecuteScalar();
Aggregate Functions
Aggregate functions return a consolidated value from multiple rows. Using CreateAggregate, developers can define and override these functions. The seed argument sets the initial context state, and the func argument is executed for each row. The resultSelector parameter, if specified, calculates the final result from the context after processing all rows.
Example: Creating an Aggregate Function for Standard Deviation
connection.CreateAggregate(
    "stdev",
    // Seed: running count, sum, and sum of squares
    (Count: 0, Sum: 0.0, SumOfSquares: 0.0),
    ((int Count, double Sum, double SumOfSquares) context, double value) =>
    {
        context.Count++;
        context.Sum += value;
        context.SumOfSquares += value * value;
        return context;
    },
    context =>
    {
        // Sum of squared deviations from the mean; dividing by Count gives the population variance
        var sumOfSquaredDeviations = context.SumOfSquares - context.Sum * context.Sum / context.Count;
        return Math.Sqrt(sumOfSquaredDeviations / context.Count);
    });

var command = connection.CreateCommand();
command.CommandText = @"
    SELECT stdev(gpa)
    FROM student
";
var stdDev = command.ExecuteScalar();
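The seed, func, and resultSelector arguments have the same shape as the three-argument overload of LINQ's Enumerable.Aggregate, so the accumulator logic can be checked without a database. The sketch below reuses the formula from the aggregate above; StdevCheck is a hypothetical helper name for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StdevCheck
{
    // Same seed, accumulator, and result selector as the stdev aggregate,
    // expressed with Enumerable.Aggregate so it runs in memory.
    public static double PopulationStdev(IEnumerable<double> values) =>
        values.Aggregate(
            (Count: 0, Sum: 0.0, SumOfSquares: 0.0),
            (context, value) =>
                (context.Count + 1, context.Sum + value, context.SumOfSquares + value * value),
            context =>
            {
                var sumOfSquaredDeviations =
                    context.SumOfSquares - context.Sum * context.Sum / context.Count;
                return Math.Sqrt(sumOfSquaredDeviations / context.Count);
            });
}
```

For the values 2, 4, 4, 4, 5, 5, 7, 9 the sum is 40 and the sum of squares is 232, so the result is the square root of (232 - 1600 / 8) / 8, which is 2. Note that this is the population standard deviation, and an empty input yields NaN because of the division by a zero count.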
Errors
When a user-defined function throws an exception in SQLite, the message is returned to the database engine, which then raises an error. Developers can customize the SQLite error code by throwing a SqliteException with a specific SqliteErrorCode.
Debugging
SQLite directly invokes the implementation of user-defined functions, allowing developers to insert breakpoints and leverage the full .NET debugging experience. This integration facilitates debugging and enhances the development of robust, error-free custom functions.
This article illustrates the power and flexibility of SQLite’s approach to user-defined functions, demonstrating how developers can extend the functionality of SQL with the programming language of their application, thereby streamlining the development process and enhancing database interaction.
GitHub Repo