
ML inference in PHP by example: leverage ONNX and Transformers on Symfony

AImachine learningPHPSymfonyopen source
19 November 2025

This blog is based on a presentation by Guillaume Moigneu at the Symfony 2024 conference. 

Machine learning and AI are no longer limited to Python and Node.js. PHP developers can now run AI models directly in their applications using modern tools and libraries. This guide shows you how to implement machine learning inference in PHP using ONNX and Transformers.

What are transformers?

Transformers are a type of neural network architecture that revolutionized AI around 2016-2017. Before transformers, sequence models processed data step by step, which made them slow and unable to take advantage of parallel hardware.

Key benefits of transformers:

  • Faster processing - Can handle multiple tasks simultaneously, unlike older sequential models
  • Better context understanding - Analyzes relationships between words through "self-attention" mechanisms
  • Parallel processing - Can work on various data points at once, making large-scale processing feasible
  • More efficient resource usage - Optimized for language processing and NLP tasks
  • Enabled modern AI - Made models like GPT and BERT possible by solving fundamental performance issues

How transformers work

When you send text to a transformer model, it goes through several steps to understand and process your input. First, the model performs tokenization, splitting your text into smaller pieces called tokens. These tokens are usually words, but can sometimes be smaller parts of words depending on the model.

Next comes the crucial self-attention mechanism, where the model analyzes relationships between all the tokens. This is what makes transformers special - they can understand how different words relate to each other in context. One word can mean completely different things in different situations, so the model needs to figure out the relationships between words to understand what you're actually trying to say.

The model then performs context mapping, using neural networks to process these relationships repeatedly. It feeds the parameters through multiple iterations, trying to understand the full meaning of your input. Finally, it generates an output based on all this contextual understanding, whether that's completing a sentence, classifying text, or answering a question.

For example, if you write "Symfony is a great framework for ___", the model analyzes the relationships between "Symfony," "framework," and "great" to predict what comes next.
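To make the tokenization step concrete, here is a deliberately simplified sketch in plain PHP. Real tokenizers use learned subword vocabularies (BPE, WordPiece); this toy version just lowercases the text and splits on whitespace and punctuation, but it shows the shape of the text-to-tokens step.

```php
// Toy tokenizer: a rough simplification of what a real subword tokenizer does.
// Real models use learned vocabularies; this just splits on spaces/punctuation.
function toyTokenize(string $text): array
{
    return preg_split(
        '/[\s\p{P}]+/u',          // split on whitespace and punctuation
        mb_strtolower($text),      // normalize case first
        -1,
        PREG_SPLIT_NO_EMPTY
    );
}

$tokens = toyTokenize('Symfony is a great framework for PHP!');
// $tokens = ['symfony', 'is', 'a', 'great', 'framework', 'for', 'php']
```

A real tokenizer would map each of these tokens to an integer ID from the model's vocabulary before inference.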

Getting started with PHP machine learning

To run machine learning models in PHP, you need:

  • Transformers PHP Library
  • FFI Extension 
  • ONNX Models 

Installation

composer require codewithkyrian/transformers

Important Note: The library downloads architecture-specific packages. Make sure you have the right version for your system (ARM64 for M1 Macs, AMD64 for Intel machines).
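Since the library depends on the FFI extension, it's worth checking that FFI is available before you start. A small diagnostic sketch (the helper name is mine; the check itself is standard PHP):

```php
// Report whether the FFI extension is loaded (required by Transformers PHP).
// Helper name is illustrative; extension_loaded() is the standard check.
function ffiStatus(): string
{
    return extension_loaded('ffi')
        ? 'FFI extension is loaded'
        : 'FFI extension is missing: enable it in php.ini (ffi.enable=true)';
}

echo ffiStatus(), PHP_EOL;
```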

Practical use cases

1. Text classification

Text classification helps you automatically categorize text content. Perfect for:

  • Analyzing user comments (positive/negative)
  • Spam detection
  • Content moderation
  • Review analysis

Example: Amazon Review Analysis

Guillaume demonstrated this concept by running a Symfony command that analyzed Nintendo Switch reviews. Here's an illustrative example based on the approach he described:

use function Codewithkyrian\Transformers\Pipelines\pipeline;

// Create a text classification pipeline (the model is downloaded on first run)
$classifier = pipeline('text-classification');

// Analyze reviews from your database ($database stands in for your own data layer)
$reviews = $database->getReviews();
$results = [];

foreach ($reviews as $review) {
    $output = $classifier($review->text);
    $results[] = [
        'review_id'  => $review->id,
        'sentiment'  => $output['label'], // 'POSITIVE' or 'NEGATIVE'
        'confidence' => $output['score'], // 0 to 1
    ];
}

Real Results: In a test with 108 Nintendo Switch reviews:

  • 79 positive reviews (73%)
  • 29 negative reviews (27%)
  • Processing time: ~14 reviews per second on standard CPU
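Summary numbers like those above can be computed from the pipeline output with a few lines of plain PHP. A sketch, assuming results shaped like the example earlier:

```php
// Aggregate per-review sentiment results into an overall breakdown.
// Each entry in $results is assumed to look like ['sentiment' => 'POSITIVE', ...].
function summarizeSentiment(array $results): array
{
    $total    = count($results);
    $positive = count(array_filter($results, fn ($r) => $r['sentiment'] === 'POSITIVE'));

    return [
        'total'            => $total,
        'positive'         => $positive,
        'negative'         => $total - $positive,
        'positive_percent' => $total > 0 ? (int) round(100 * $positive / $total) : 0,
    ];
}

// With 79 positive results out of 108, this yields 73% positive.
```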

2. Image classification

Automatically categorize and tag uploaded images. Use cases include:

  • Adding alt text to images
  • Content filtering
  • Automatic tagging
  • Object detection

Example: Hot Dog Detection

The speaker demonstrated this with a live hot dog detection app. Here's an illustrative example based on his description:

use function Codewithkyrian\Transformers\Pipelines\pipeline;

// Create an image classification pipeline
$classifier = pipeline('image-classification');

// The pipeline accepts an image path (or URL) directly
$result = $classifier('uploaded_image.jpg');

// Example result: ['label' => 'hot dog', 'score' => 0.95]

Performance: Nearly instant results on small servers (under 1 second for typical web images).

3. Text generation

Generate text automatically for:

  • Alt text for images
  • Product descriptions
  • Content suggestions
  • Metadata generation

Example: Text Completion

Guillaume demonstrated this by asking, "Can a taco be considered a sandwich?" Here's an illustrative example based on the approach he showed:

use function Codewithkyrian\Transformers\Pipelines\pipeline;

// flan-t5 is a text-to-text model; 'Xenova/flan-t5-small' is its ONNX build
$generator = pipeline('text2text-generation', 'Xenova/flan-t5-small');

$prompt = "Can a taco be considered a sandwich?";

// temperature controls creativity (0-1); option names may vary by library version
$response = $generator($prompt, maxNewTokens: 64, temperature: 0.7);

// Example result: "Yes, a taco is a sandwich made of bread"

System Requirements: Runs on 2 CPU cores with 2GB RAM. Response time: 1-2 seconds.
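The temperature parameter mentioned above scales the model's raw output scores before a next token is sampled. The snippet below is a toy illustration of that math in plain PHP, not the library's internals: low temperature sharpens the distribution toward the most likely token, high temperature flattens it, making output more varied.

```php
// Toy temperature-scaled softmax over raw model scores (logits).
// Illustrates the math only; real sampling happens inside the model runtime.
function softmaxWithTemperature(array $logits, float $temperature): array
{
    $scaled = array_map(fn ($l) => $l / $temperature, $logits);
    $max    = max($scaled); // subtract the max for numerical stability
    $exps   = array_map(fn ($l) => exp($l - $max), $scaled);
    $sum    = array_sum($exps);

    return array_map(fn ($e) => $e / $sum, $exps);
}

$logits = ['bread' => 2.0, 'tortilla' => 1.0, 'pizza' => 0.1];

// Low temperature: probability mass concentrates on the top-scoring token.
$sharp = softmaxWithTemperature($logits, 0.2);

// High temperature: probabilities spread out, so sampling is more "creative".
$flat = softmaxWithTemperature($logits, 2.0);
```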

How it works behind the scenes

1. FFI Extension

PHP's Foreign Function Interface (FFI) extension is not actually new (it shipped with PHP 7.4), but until recently it saw little use. FFI lets PHP declare and call functions from C libraries directly, with no translation layer and no separate PHP extension to compile. Instead of shelling out to external binaries or relying on older bridging techniques, you can load a native library in-process and run it at close to native speed.

FFI has been a big win for performance and opens up many use cases beyond machine learning. It is also why, as Guillaume noted, Transformers PHP ships architecture-specific packages: the library uses FFI to run native ONNX Runtime code behind the scenes, and that native code must match your CPU architecture.
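As a minimal illustration of what FFI looks like in practice, the sketch below binds a single libc function from PHP. It assumes a Linux system with glibc (`libc.so.6`) and FFI enabled (`ffi.enable=true`); the ONNX Runtime bindings used by Transformers PHP follow the same pattern at a much larger scale.

```php
// Minimal FFI sketch: declare a C signature, load the shared library, call it.
// Assumes Linux with glibc and the FFI extension enabled in php.ini.
function absViaLibc(int $x): int
{
    $libc = FFI::cdef(
        'int abs(int x);', // C declaration for the function we bind
        'libc.so.6'        // shared library to load (glibc on Linux)
    );

    return $libc->abs($x); // direct C call, no process spawn
}
```

Calling `absViaLibc(-42)` returns `42` by executing glibc's `abs()` in-process, which is the same mechanism the library uses to hand tensors to ONNX Runtime.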

2. ONNX (Open Neural Network Exchange)

ONNX solves a major problem in machine learning. When you create and train a model, you typically use a specific framework like Google's TensorFlow or PyTorch. However, this creates a significant limitation: if a model was created with TensorFlow, you can't use it in a PyTorch production environment, and vice versa. You're locked into using the same framework for both training and inference.

ONNX provides a solution: a standard, open format that works across frameworks. The standard was originally created by Facebook and Microsoft and is now backed by many major industry players. With ONNX, you can train a model in any framework, convert it to the ONNX format, and then run it anywhere, regardless of the original training environment.

3. Model Sources

Hugging Face Hub

If you want to explore what's available, go to the Hugging Face Hub. It hosts nearly a million models, and you can test many of them directly in the browser. Models are organized by task: video classification, question answering, table question answering, token classification, and many more.

You will likely find something that matches your use case, though expect some trial and error before settling on a model that meets your needs. The Hub also lets you filter for ONNX models, so you can be sure a model is compatible with Transformers PHP before you download it.

SmolLM initiative

A more recent development, introduced by Hugging Face in mid-2024, is a family of compact models called SmolLM. The goal is to produce models that can generate long-form text much like ChatGPT or Claude, but bundled into roughly 500 MB or less. At that size they load on most ordinary machines, even without a GPU, and you can run them through tools like Ollama or through the Transformers PHP library.

Production deployment options

For small models (text/image classification)

For text and image classification tasks, you can run these models directly on your PHP servers without any issues. The speaker demonstrated this on a very small machine with two CPUs and 2 GB of RAM, and it performed well. The models themselves are small (the text classification model he used was about 2 MB), so inference is fast and you can deploy it anywhere. It's easy to bring into real production applications today.

For large language models

Here you have three options:

  1. External APIs
    • OpenAI, Claude, etc.
    • Easy to implement
    • No control over uptime or responses
  2. Hosted Model Endpoints
    • Deploy your own model on GPU infrastructure
    • Hugging Face Inference Endpoints
    • More expensive, but you control the model
  3. Self-Hosted
    • Complete control
    • Requires GPU infrastructure
    • You handle all operations and maintenance

Best practices and testing

Model selection

  • Start with popular models on Hugging Face
  • Test multiple models for your specific use case
  • Popular models typically have better accuracy
  • Look for active development and regular updates

Testing your models

  1. Create test datasets - Manually label 200+ examples
  2. Run comparisons - Test model output against known results
  3. Monitor inconsistencies - Same input should give consistent output
  4. Check edge cases - Test with punctuation changes, typos
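Step 2 above, comparing model output against known results, can be sketched in a few lines of plain PHP. The dataset shape and function name here are assumptions for illustration:

```php
// Compare model predictions against a hand-labeled test set and report accuracy.
// $labeled entries: ['text' => ..., 'expected' => 'POSITIVE' | 'NEGATIVE'].
// $classify is any callable returning ['label' => ..., 'score' => ...],
// for example a Transformers PHP pipeline.
function evaluateAccuracy(array $labeled, callable $classify): float
{
    if (count($labeled) === 0) {
        return 0.0;
    }

    $correct = 0;
    foreach ($labeled as $example) {
        $prediction = $classify($example['text']);
        if ($prediction['label'] === $example['expected']) {
            $correct++;
        }
    }

    return $correct / count($labeled);
}
```

Run this against your 200+ manually labeled examples for each candidate model, and keep the model that scores best on your own data rather than on generic benchmarks.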

Important warning

Always monitor AI-generated content in production. Models can produce unexpected or harmful outputs. Consider:

  • Content filtering
  • Human review for sensitive applications
  • Fallback mechanisms
  • Regular monitoring and alerts
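One simple guardrail from the list above is a confidence threshold: accept the model's label automatically only when its score is high enough, and route everything else to human review. A sketch, with the threshold value and action names as illustrative assumptions:

```php
// Route a classification result: auto-accept confident predictions,
// flag low-confidence ones for human review. Names are illustrative.
function routeResult(array $prediction, float $threshold = 0.85): array
{
    if ($prediction['score'] >= $threshold) {
        return ['action' => 'auto_accept', 'label' => $prediction['label']];
    }

    return ['action' => 'human_review', 'label' => $prediction['label']];
}

// A confident result is applied automatically...
$decision = routeResult(['label' => 'POSITIVE', 'score' => 0.97]);

// ...while a borderline one is queued for a person to check.
$flagged = routeResult(['label' => 'NEGATIVE', 'score' => 0.55]);
```

The right threshold depends on your model and your tolerance for errors; measuring it against a labeled test set is a good place to start.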

Conclusion

PHP developers can now leverage machine learning directly in their applications without external dependencies. While there are limitations (no GPU support yet), the current capabilities are sufficient for many real-world use cases.

The combination of FFI, ONNX, and the Transformers PHP library makes it possible to:

  • Analyze user-generated content in real-time
  • Automatically classify and tag images
  • Generate helpful text content
  • Build smarter web applications

Start small, test thoroughly, and gradually expand your use of AI in PHP applications. The future of PHP and machine learning is just getting started.
