• Formerly Platform.sh
  • Contact us
  • Docs
  • Login
Watch a demoFree trial
Blog
Blog
BlogProductCase studiesNewsInsights
Blog

Mastering linked data in PHP: graphs and algorithms

SymfonyConPresentationGitdataCLIrustautomation
24 September 2024
Christian Rades
Christian Rades
Principal Engineer
Share

This blog is based on a presentation by Christian Rades, Principal Engineer at Shopware, at the Symfony conference 2023. We utilized AI tools for transcription and to enhance the structure and clarity of the content.

When Christian started with an unusual proposition: to explore the "boring kind of graphs", not colorful charts, but kinds made of points and lines, used to show how things connect, what followed was a fascinating journey through centuries of mathematical thinking and its surprisingly practical applications to modern software development.

The seven bridges problem: A quick story that changed the field

The story of graphs in computer science begins in 1736 in the Prussian city of Königsberg (now Kaliningrad). Picture this: a group of friends at a bar, perhaps after a few beers, pondering a seemingly simple question – could you walk through the city crossing each of its seven bridges exactly once?

What began as a casual puzzle evolved into a profound mathematical breakthrough. Mathematician Leonhard Euler tackled this problem not with traditional geometry or algebra, but by inventing an entirely new way of thinking. He abstracted the physical city into something simpler: points (representing the land masses) connected by edges (representing the bridges).

Euler's insight was elegant. To traverse a path where you cross each bridge only once, you must enter and exit each point. This means that each point requires an even number of connections – one to enter and one to exit. Since Königsberg had points with odd numbers of bridges, the puzzle was mathematically impossible.

This abstraction – reducing complex structures to points and connections – became the foundation of graph theory.

What actually are graphs?

Christian breaks down the mathematical definition into human terms: "Vertices are points, edges are how those points are connected. Nothing else."

Unlike graphics in computer science, graphs have no coordinates or physical positions. They're purely about relationships. You can draw them however you like; the arrangement is just to help humans understand them better.

This simplicity is powerful. Tangled nets, sprawling cities, and complex maps all share something fundamental: they can all be transformed into graphs. Each intersection becomes a point, each connecting path becomes an edge. This is precisely how tools like Google Maps work: they convert road networks into graphs, then use graph algorithms to trace your route from point A to point B.

Navigating Data Structures

Christian demonstrates this concept through JQ, a command-line tool for processing JSON data. When you write a JQ query, such as accessing data[0].address.street. You're essentially navigating through a graph.

He walks through an example: starting at the root JSON object, moving to the data array, selecting the first element, accessing the address field, and finally reaching the street value. Each step is moving from one node to another along edges with meaningful names.

"This is the Google Maps view of this data structure," Christian explains. By separating the structure of data from the data itself, you can perform operations purely on structure. You can define sub-paths and reuse them: "Get into the address field, then get into the street field" – and apply this pattern across entire datasets.

The object-relational mapping challenge

Every developer working with databases knows this pain: objects exist in memory without a specific order, but databases require order due to foreign key constraints. You can't reference data that the database does not yet know about.

At Shopware, Christian's team aimed to create a developer-friendly API that allowed users to submit inventory data as nested JSON, including products with their categories - all in one go. However, there was a problem: you can't create a product referencing categories without those categories already existing and having IDs.

This is where thinking in graphs becomes transformative. The problem is primarily about directed graphs, where the direction of connections is significant. One table can reference another through a foreign key, but not vice versa, or a cycle is created.

Topological sorting: ordering the chaos

Christian introduces Kahn's algorithm for topological sorting. The goal is to determine the correct order to insert data into the database.

The algorithm starts by counting incoming connections for each node. Then it follows a simple process:

  1. Select any node with zero incoming connections (if none exist, you have a cycle)
  2. Add it to the insertion stack
  3. Remove its edges from the graph
  4. Decrease the incoming connection count for affected nodes
  5. Repeat

Working through an example of a product with categories, the algorithm reveals the correct insertion order: root entities first, then their dependencies, and finally the product itself. Each step respects foreign key constraints automatically.

Detecting cycles: when data goes wrong

But what happens when your data has circular dependencies? Christian explains that once you've modeled your problem as a graph, you can simply “Google it”,  established algorithms already exist.

Using depth-first search, you can detect cycles by traversing the graph and maintaining a stack. When you encounter a node you've already visited in your current path, you've found a cycle. Christian demonstrates how this reveals broken data: “Products reference root, root references awesome things, awesome things reference products”, an impossible situation.

With this detection, you can provide users with precise error messages or, in certain domains, make informed decisions about how to break cycles effectively.

Git history as a social network

Christian's most intriguing application transforms Git repositories into social networks – not of people, but of files.

Git is already a Directed Acyclic Graph (DAG) that flows from present to past. Each commit represents a meaningful change. However, Christian views it differently: files that change together in commits are "talking" to each other.

If a commit creates a new service file D and modifies file A, those files are now related. A contained a test that broke, or needed updating for the new service. These connections accumulate across the entire commit history.

By transforming this data into a graph where files are nodes and co-changes are weighted edges (the weight being how often they changed together), Christian can analyze the architecture itself.

Between centrality: finding architectural bottlenecks

Borrowing from social network analysis, Christian applies "betweenness centrality" – a metric that identifies which nodes connect different parts of a network. Companies like Facebook use this to identify influential people; Christian uses it to identify influential files.

Running this analysis on Shopware's codebase revealed eye-opening results. Using a Sunburst diagram showing each folder's contribution to overall connectedness:

  • The snippets folder contributed 25% of all connectedness. Why? Because UI changes require updating translation strings, a bridge is needed between the administration layer and snippets.
  • E2E tests showed high centrality – but this revealed brittleness. Every UI change broke tests, forcing developers to modify test files frequently.
  • The most brittle PHP test? The API schema validator, which intentionally compared the entire API schema against a committed JSON file. Any API change triggered this test.

Christian could identify these architectural "boulders" – obstacles that developers constantly encounter – purely from analyzing Git history.

Beyond data: thinking structurally

Christian's closing message challenges developers to expand their toolkit. We have standard patterns for data: factories, commands, and various design patterns. But what about structure?

"We can analyze how our data looks and verify it with that, a bit like a type system," he explains. But more powerfully, you can calculate new information from the structure itself. Your commit history, your dependency graph, your data relationships – these structures contain insights waiting to be extracted using standard, well-documented algorithms.

He leaves us with another example: using dependency graphs from tools like deptrac and applying layout algorithms to visualize which parts of a system are tightly coupled. By coloring nodes by folder, teams could identify when supposedly independent components were in fact closely related.

The practical takeaway

Christian's examples demonstrate that graph theory is not merely abstract mathematics confined to textbooks. It's a practical framework for solving real development challenges:

  • Ordering database insertions by modeling dependencies as directed graphs
  • Detecting invalid data through cycle detection
  • Analyzing codebase health through commit history
  • Identifying brittle tests and architectural bottlenecks
  • Visualizing system coupling and dependencies

The key insight? Many problems we face aren't really about data manipulation – they're about understanding and working with structure. And for structure, graphs and their algorithms provide robust, proven solutions.

As Christian puts it, "Graphs are everywhere." Once you start seeing your problems through this lens, you'll find applications in unexpected places. The algorithms exist, the libraries are available, and the patterns are well-documented. All that's needed is recognizing when you're facing a graph problem – and Christian's talk provides plenty of inspiration for making that recognition.

For developers interested in exploring these concepts further, Christian's tools are available on GitHub (including "rorqual," his Rust-based repository analyzer), and the techniques he describes are built on widely available libraries such as NetworkX for Python and various graph libraries in other languages.

The next time you're wrestling with dependencies, navigating complex data structures, or wondering why specific files always seem to change together, remember: you might just be looking at a graph problem in disguise.

Stay updated

Subscribe to our monthly newsletter for the latest updates and news.

Your greatest work
is just on the horizon

Free trial
UpsunFormerly Platform.sh

Join our monthly newsletter

Compliant and validated

ISO/IEC 27001SOC 2 Type 2PCI L1HIPAATX-RAMP
© 2025 Upsun. All rights reserved.