Being GDPR-compliant across all your projects is a daily challenge, especially if you’re managing sensitive user data. Upsun adopts a GDPR-everywhere approach with high levels of built-in security and compliance as standard, but there are ways to further secure your data on our PaaS when it comes to preview environments.
Each time you create a new Git branch on a project on Upsun, the corresponding environment inherits the data (assets and database) from its parent. This means that potentially sensitive data from your production website could be exposed to the preview environment.
So, how do you navigate this and ensure your application remains compliant? Two words: data sanitization, the deliberate and permanent erasure of sensitive data from a storage device so that the data is non-recoverable. In this article, I will share the methods of data sanitization that you can implement for preview environments to ensure that your data remains safe at every stage of development.
Some necessary resources before we start
We have some prerequisites to ensure that you can follow the solutions and steps detailed in this article—please make sure you have installed the following:
- Git
- PHP
- jq library
- Symfony CLI (please note: if you’re not using the Symfony stack, you will need to download the Upsun CLI instead)
Methods for application data sanitization
For the purpose of this article, we are going to focus on preview environment data sanitization on Symfony. However, the methods detailed apply to all frameworks.
Throughout the article, we’re going to walk through the five methods available to sanitize Symfony preview environment data on Upsun, so you can weigh them up and choose the best one for you. However, make sure that you complete the “create a command” step before proceeding with any method.
If you already know the method you would prefer to use, go ahead and jump straight to the relevant section below:
- Manual data sanitization
- Using environment inheritance
- Using a hook
- Using runtime operations and activity scripts
- Using shell scripts
Create a command
To carry out any of the five data sanitization methods listed above, we need a callable to sanitize our environments. There are two possible ways to do so:
- Using an SQL script to update or fake all sensitive data
- Using a Symfony command to do it, perhaps using the FakerPHP library
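For comparison, the first option (an SQL script) could look like the sketch below. It is only an illustration and is not used in the rest of this article: the function name `sanitize_sql` is hypothetical, and the table name `symfony_demo_user` and the `id`, `username`, and `email` columns are assumptions about your schema that you should check before running anything.

```shell
# Hypothetical helper: print an UPDATE statement that overwrites usernames
# and emails with values derived from each row's id.
# $1: table name (assumed default: symfony_demo_user).
sanitize_sql() {
  table="${1:-symfony_demo_user}"
  cat <<SQL
UPDATE ${table}
SET username = CONCAT('user_', id),
    email = CONCAT('user_', id, '@example.com');
SQL
}

# Example: review the generated statement before piping it to a database client.
# sanitize_sql
```

Deriving the fake values from the primary key keeps them unique, which matters if the username or email columns carry a unique constraint.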
Since we are using a Symfony Demo application, we will use the second option. Do the following from the main Git branch:
symfony composer require --dev fakerphp/faker
git add composer.json composer.lock && git commit -m "composer require --dev fakerphp bundle"
Then open your code in your favorite IDE and create a new Symfony command in a src/Command/SanitizeDataCommand.php file, with the following:
<?php
/* src/Command/SanitizeDataCommand.php */

namespace App\Command;

use App\Entity\User;
use App\Repository\UserRepository;
use Doctrine\ORM\EntityManagerInterface;
use Faker;
use Symfony\Component\Console\Attribute\AsCommand;
use Symfony\Component\Console\Command\Command;
use Symfony\Component\Console\Input\InputInterface;
use Symfony\Component\Console\Output\OutputInterface;
use Symfony\Component\Console\Style\SymfonyStyle;

#[AsCommand(
    name: 'app:sanitize-data',
    description: 'Sanitize user data (username and email).',
    aliases: ['app:sanitize']
)]
class SanitizeDataCommand extends Command
{
    private SymfonyStyle $io;

    public function __construct(private UserRepository $userRepository, private EntityManagerInterface $entityManager)
    {
        parent::__construct();
    }

    protected function configure(): void
    {
        $this
            ->setDescription('This command allows you to sanitize user data (username and email).');
    }

    protected function initialize(InputInterface $input, OutputInterface $output): void
    {
        $this->io = new SymfonyStyle($input, $output);
    }

    protected function execute(InputInterface $input, OutputInterface $output): int
    {
        $users = $this->userRepository->findAll();
        $this->io->progressStart(count($users));
        // initialize faker once, outside the loop
        $faker = Faker\Factory::create();
        $this->entityManager->getConnection()->beginTransaction(); // suspend auto-commit
        try {
            /** @var User $user */
            foreach ($users as $user) {
                $this->io->progressAdvance();
                $this->io->text('faking user '.$user->getUsername());
                // fake user info
                $user->setUsername(uniqid($faker->userName()));
                $user->setEmail($faker->email());
                // please adapt to your needs
            }
            // write all changes, then commit them in a single transaction
            $this->entityManager->flush();
            $this->entityManager->getConnection()->commit();
            $this->io->progressFinish();
        } catch (\Exception $e) {
            // leave the data untouched if anything fails
            $this->entityManager->getConnection()->rollBack();
            throw $e;
        }

        return Command::SUCCESS;
    }
}
This app:sanitize-data command uses the UserRepository and fakes the username and email of the default Symfony User entity. Please adapt it to your needs. Then push your code to the main branch:
git add src/Command/SanitizeDataCommand.php && git commit -m "sanitize data command"
symfony deploy
Manual data sanitization
Now that your source code contains a Symfony command to sanitize your data, we will use it manually in a new preview environment. Start by creating a new staging branch and wait for the process to finish:
symfony branch staging --type=staging
Then, execute your newly-created Symfony command on your Upsun staging environment, as seen below:
symfony ssh php bin/console -e dev app:sanitize-data
Et voilà, your preview environment data is sanitized!
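If you have several preview environments to sanitize by hand, the same command can be wrapped in a small loop. The following is a minimal sketch rather than part of the workflow above: the function names `run` and `sanitize_envs` and the `DRY_RUN` switch are hypothetical, and it assumes that `main` is your production environment and that `symfony ssh` accepts `-e <environment>` followed by `--` and the remote command.

```shell
# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: sanitize each environment passed as an argument.
# Assumes "main" is production and must never be sanitized.
sanitize_envs() {
  for env in "$@"; do
    if [ "$env" = "main" ]; then
      echo "skipping $env (production)"
      continue
    fi
    run symfony ssh -e "$env" -- php bin/console -e dev app:sanitize-data
  done
}

# Example: preview what would be executed, without touching any environment.
# DRY_RUN=1 sanitize_envs staging main
```

Running it with `DRY_RUN=1` first is a cheap way to confirm which environments would be touched.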
Using environment inheritance
In this section, we will create a preview environment, sanitize its data, and then make all new environments inherit from that preview environment.
As mentioned at the top of this article, each time you create a new Git branch on Upsun, the created environment will inherit data from the parent environment. However, it’s possible to change the default data inheritance and set it later to synchronize data from a new parent—the preview environment we will create.
The Symfony CLI offers the ability to create a branch without cloning its parent, using the --no-clone-parent option. You can then set the parent to staging (a.k.a. preview), which ensures new branches inherit the preview environment’s data. Follow the manual data sanitization instructions above to sanitize the preview environment’s data first, so that any future branches inherit sanitized, GDPR-compliant data.
symfony checkout main
symfony branch dev --no-clone-parent
symfony env:info -e dev parent staging
symfony sync -e dev data
And that’s it, your new dev environment is now created with sanitized data from your preview environment.
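The four commands above can be grouped into one reusable function so that every new branch is created the same way. This is a sketch under assumptions: `run`, `branch_from_sanitized_parent`, and `DRY_RUN` are hypothetical names, and the sanitized parent is assumed to be called `staging` unless you pass another name.

```shell
# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: create a branch without cloning its parent, re-parent it
# to an already sanitized environment, then sync the sanitized data into it.
# $1: new branch name. $2: sanitized parent (assumed default: staging).
branch_from_sanitized_parent() {
  branch="$1"
  parent="${2:-staging}"
  run symfony checkout main &&
    run symfony branch "$branch" --no-clone-parent &&
    run symfony env:info -e "$branch" parent "$parent" &&
    run symfony sync -e "$branch" data
}

# Example: show the commands for a new "dev" branch without running them.
# DRY_RUN=1 branch_from_sanitized_parent dev staging
```

Chaining the steps with `&&` stops the sequence at the first failure, so data is never synced into a branch whose parent was not changed.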
Using a hook
Rather than relying on inheritance, it may be desirable to sanitize certain data on each deployment. In this case, we can move our script call to the hooks section of the configuration.
The type of hook you choose, deploy or post_deploy, is up to you, but here are a few things to keep in mind:
- A long-running script within the deploy hook will extend the deployment time of an application.
- A long-running script within the post_deploy hook could make non-compliant/critical data momentarily public while the sanitization is taking place.
- Redeploys: if sanitization is something you’d like to be able to trigger manually with a redeploy, sanitizing will take place on each redeploy only if placed in the post_deploy hook.
To execute a Symfony command during the post_deploy hook, add the following in your .upsun/config.yaml:
applications:
  app:
    hooks:
      build: ...
      deploy: ...
      post_deploy: |
        if [ "$PLATFORM_ENVIRONMENT_TYPE" != production ]; then
          # The sanitization of the database should happen here (since it's non-production)
          php bin/console -e dev app:sanitize-data
        fi
Then push your code to the main branch:
git checkout main && git add .upsun/config.yaml && git commit -m "add sanitize data command to post_deploy hook"
symfony deploy
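The hook's guard treats everything that is not production as safe to sanitize, which also covers the case where PLATFORM_ENVIRONMENT_TYPE is empty or unset. If you would rather sanitize only when the environment type is explicitly known, an allowlist is one possible variation. The sketch below assumes the non-production types are `staging` and `development`; the function name `should_sanitize` is hypothetical.

```shell
# Hypothetical helper: succeed only for explicitly non-production types.
# An unset or unexpected PLATFORM_ENVIRONMENT_TYPE is treated as "do not sanitize".
should_sanitize() {
  case "${PLATFORM_ENVIRONMENT_TYPE:-}" in
    staging|development) return 0 ;;
    *) return 1 ;;
  esac
}

# Example use inside a hook:
# if should_sanitize; then
#   php bin/console -e dev app:sanitize-data
# fi
```

The trade-off is that an environment with a missing type is left unsanitized instead of being sanitized by default, so pick the behavior that fails in the direction you prefer.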
Using runtime operations and activity scripts
There is another option: a custom trigger that runs in response to certain activities taking place on the project. This matters because synchronizing an environment with its parent can bring back non-anonymized data from the parent (e.g., when synchronizing from the production environment).
The two components that will make this work are:
- A runtime operation: will allow you to trigger one-off commands or scripts on your project. Similar to crons, they run in the application container but not on a specific schedule.
- An activity script: a JavaScript piece of code that will be run in response to certain activities taking place at the project, environment, or even organization level.
So we will add an integration (activity script) that responds to certain events by executing a runtime operation to sanitize data on the fly. The steps for adding the activity script integration are described further below.
How to create a runtime operation
To configure a runtime operation, we need to add a new operations key under our application in the .upsun/config.yaml file, with the following:
applications:
  app:
    operations:
      sanitize:
        role: admin
        commands:
          start: |
            if [ "$PLATFORM_ENVIRONMENT_TYPE" != production ]; then
              # The sanitization of the database should happen here (since it's non-production)
              php bin/console -e dev app:sanitize-data
            fi
Then push your file to the main branch and deploy:
git checkout main
git add .upsun/config.yaml && git commit -m "add runtime operation to sanitize data"
symfony deploy
And if you want to test this runtime operation manually, you can use the following:
symfony operation:run sanitize --app=app
How to create an activity script
Upsun supports custom scripts that can fire in response to any activity. This script is executed outside of the environment context, so we need to re-create this context for the activity script to be executed with the necessary rights. To do so, create a new file src/runtime/sanitize.js with the following:
// src/runtime/sanitize.js
let app_container = "app";
let runtime_operation_name = "sanitize";

if (!variables.api_token) {
  console.log("Variable API Token is not defined!");
  console.log("Please define an environment variable with your API Token using command: ");
  console.log("upsun project:curl /integrations/<INTEGRATION_ID>/variables -X POST -d '{\"name\": \"api_token\", \"value\": \"<API_TOKEN>\", \"is_sensitive\": true, \"is_json\": false}' ");
} else {
  console.log("OAuth2 API Token defined");
  // exchange the API token for a short-lived OAuth2 access token
  let resp = fetch('https://auth.api.platform.sh/oauth2/token', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/x-www-form-urlencoded'
    },
    body: "client_id=platform-api-user&grant_type=api_token&api_token=" + variables.api_token
  });
  if (!resp.ok) {
    // without an access token there is nothing more to do
    console.log("Failed to get an OAuth2 token, status code was " + resp.status);
  } else {
    console.log("OAuth2 API TOKEN ok");
    let access_token = resp.json().access_token;
    // get current branch from activity object
    let branch;
    switch (activity.type) {
      case 'environment.synchronize':
        branch = activity.parameters.into;
        break;
      case 'environment.branch':
      case 'environment.activate':
        branch = activity.parameters.environment;
        break;
    }
    // run runtime operation runtime_operation_name on current/targeted environment
    resp = fetch("https://api.upsun.com/api/projects/" + activity.project + "/environments/" + branch + "/deployments/current/operations",
      {
        headers: {
          "Authorization": "Bearer " + access_token
        },
        method: "POST",
        body: JSON.stringify({"service": app_container, "operation": runtime_operation_name}),
      });
    if (!resp.ok) {
      console.log("Failed to invoke the runtime operation, status code was " + resp.status);
    } else {
      console.log(runtime_operation_name + " launched");
    }
  }
}
This activity script uses an API token, stored as a variable on the integration, to authenticate against the Upsun API and execute the previously defined runtime operation on the current environment. We first need to add the activity script as an integration, and then define the API token variable for that integration.
Then push your file to the main branch and deploy, like so:
symfony checkout main
git add src/runtime/sanitize.js
git commit -m "add activity script"
symfony deploy
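When debugging, it can help to reproduce from a terminal the request that the script sends to start the runtime operation. The sketch below only builds the URL and JSON body used by the script; the function names are hypothetical, and percent-encoding the slash in branch names such as `feature/x` is an assumption about how the API expects environment names in the path.

```shell
# Hypothetical helper: percent-encode "/" in an environment name.
encode_env() {
  printf '%s' "$1" | sed 's|/|%2F|g'
}

# Hypothetical helper: build the operations URL used by the activity script.
# $1: project ID. $2: environment (branch) name.
operations_url() {
  project="$1"
  environment="$(encode_env "$2")"
  printf 'https://api.upsun.com/api/projects/%s/environments/%s/deployments/current/operations' "$project" "$environment"
}

# Hypothetical helper: build the JSON body, mirroring JSON.stringify in the script.
# $1: application container name. $2: runtime operation name.
operation_payload() {
  printf '{"service": "%s", "operation": "%s"}' "$1" "$2"
}

# Example (ACCESS_TOKEN and PROJECT_ID are placeholders you must provide):
# curl -X POST -H "Authorization: Bearer $ACCESS_TOKEN" \
#   -d "$(operation_payload app sanitize)" "$(operations_url "$PROJECT_ID" staging)"
```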
Three Upsun events should trigger this runtime operation:
- When creating a new branch (environment.branch),
- When synchronization of data between environments occurs (environment.synchronize),
- When activating a preview environment (environment.activate).
To implement these triggers, use this command in your terminal to add an activity script integration.
symfony integration:add --type script --file ./src/runtime/sanitize.js --events environment.branch,environment.synchronize,environment.activate --states complete --environments \*
Add an API Token environment variable
First, get the previous integration ID using the following command:
symfony integration:list
Then, create a new API Token from the Console, keep its value at hand, and substitute it in this terminal command:
symfony project:curl /integrations/<INTEGRATION_ID>/variables -X POST -d '{"name": "api_token", "value": "<API_TOKEN>", "is_sensitive": true, "is_json": false}'
You can verify that the variable has been created with this command:
symfony project:curl /integrations/<INTEGRATION_ID>/variables
Time to test
To test that everything works, in the Console or with the CLI, create a new branch from main, trigger a sync, or deactivate and reactivate your preview environment. You should then see two activities:
- Activity triggered
- A runtime operation activity
Run into a problem? Debug it
If you encounter a problem and want to debug the activity script integration, you need to use the following command:
symfony integration:activity:log <INTEGRATION_ID>
When adding the integration of your activity script, the corresponding script is added in memory on the Upsun side. This means that each time you update your script, you need to update the cached version of the file, using the following command:
symfony integration:update <INTEGRATION_ID> --file ./src/runtime/sanitize.js
Using shell scripts
It’s possible to use a shell script to automate the data sanitization of all of your environments, except production, for all your projects within an organization. To use this shell script, please ensure that the source code of every environment in every project of your organization contains the Symfony command to sanitize data before working through the following steps.
The first step is to create a file named fleet_sanitizer.sh with the following code:
if [ -n "$ZSH_VERSION" ]; then emulate -L ksh; fi
######################################################
# fleet sanitization demo script, using the CLI.
#
# Enables the following workflow on a given project and sanitizes preview environments (staging, new-feature and auto-updates environments):
# .
# └── main
# ├── staging
# | └── new-feature
# └── auto-updates
#
# Usage
# 1. source this script: `. fleet_sanitizer.sh` or `source fleet_sanitizer.sh` depending on your local machine
# 2. define ORGANIZATION var: ORGANIZATION=<organizationIdentifier>
# 3. run `sanitize_organization_data $ORGANIZATION`
######################################################
# Utility functions.
# list_org_projects: Print list of projects operation will be applied to before starting.
# $1: Organization, as it appears in console.upsun.com.
list_org_projects() {
    symfony project:list -o "$1" --columns="ID, Title"
}
# get_org_projects: Retrieve an array of project IDs for a given organization.
# Note: Makes array variable PROJECTS available to subsequent scripts.
# $1: Organization, as it appears in console.upsun.com.
get_org_projects() {
    PROJECTS_LIST=$(symfony project:list -o "$1" --pipe)
    PROJECTS=($PROJECTS_LIST)
}
# get_project_envs: Retrieve an array of envs IDs for a project.
# Note: Makes array variable ENVS available to subsequent scripts.
# $1: ProjectId, as it appears in console.upsun.com.
get_project_envs() {
    ENV_LIST=$(symfony environment:list -p "$1" --pipe)
    ENVS=($ENV_LIST)
}
# list_project_envs: Print list of envs operation will be applied to before starting.
# $1: ProjectId, as it appears in console.upsun.com.
list_project_envs() {
    symfony environment:list -p "$1"
}
# add_env_var: Add environment level environment variable.
# $1: Variable name.
# $2: Variable value.
# $3: Target project ID.
# $4: Target environment ID.
add_env_var() {
    # A missing variable returns an error payload with a status; an existing one has no status field.
    VAR_STATUS=$(symfony project:curl -p "$3" "/environments/$4/variables/env:$1" | jq '.status')
    if [ "$VAR_STATUS" != "null" ]; then
        symfony variable:create --name "$1" --value "$2" --prefix env: --project "$3" --environment "$4" --level environment --json false --sensitive false --visible-build true --visible-runtime true --enabled true --inheritable true -q
    else
        printf "\nVariable %s already exists. Skipping." "$1"
    fi
}
# Main functions.
sanitize_organization_data() {
    list_org_projects "$1"
    get_org_projects "$1"
    for PROJECT in "${PROJECTS[@]}"; do
        printf "\n### Project %s." "$PROJECT"
        # get environments list
        list_project_envs "$PROJECT"
        get_project_envs "$PROJECT"
        for ENVIRONMENT in "${ENVS[@]}"; do
            # status and type of the environment, as reported by the API
            ENV_CHECK=$(symfony project:curl -p "$PROJECT" "/environments/$ENVIRONMENT" | jq -r '.status')
            ENV_TYPE=$(symfony project:curl -p "$PROJECT" "/environments/$ENVIRONMENT" | jq -r '.type')
            if [ "$ENV_CHECK" = active ] && [ "$ENV_TYPE" != production ]; then
                DATA_SANITIZED=$(symfony variable:get -p "$PROJECT" -e "$ENVIRONMENT" env:DATA_SANITIZED --property=value)
                if [ "$DATA_SANITIZED" != true ]; then
                    printf "\nEnvironment %s exists and is not sanitized yet. Sanitizing data." "$ENVIRONMENT"
                    printf "\n"
                    # do sanitization here
                    symfony ssh -p "$PROJECT" -e "$ENVIRONMENT" -- php bin/console app:sanitize-data
                    printf "\nSanitizing data is finished, redeploying"
                    add_env_var DATA_SANITIZED true "$PROJECT" "$ENVIRONMENT"
                else
                    printf "\nEnvironment %s exists and does not need to be sanitized. Skipping." "$ENVIRONMENT"
                fi
            elif [ "$ENV_TYPE" = production ]; then
                printf "\nEnvironment %s is the production one, skipping." "$ENVIRONMENT"
            else
                printf "\nEnvironment %s is not active (%s), skipping." "$ENVIRONMENT" "$ENV_CHECK"
            fi
        done
    done
}
Then, depending on the machine you want to run this script on, adapt the code to your needs. Usage should look something like this:
. fleet_sanitizer.sh # or source fleet_sanitizer.sh
ORGANIZATION=<organizationIdentifier>
sanitize_organization_data $ORGANIZATION
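Before pointing the script at a whole organization, it can be reassuring to check its decision logic in isolation. The sketch below restates the script's conditions (active, not production, not already flagged with DATA_SANITIZED) as a pure function that needs no CLI access; the function name `needs_sanitizing` is hypothetical and not part of fleet_sanitizer.sh.

```shell
# Hypothetical helper mirroring the checks made in sanitize_organization_data.
# $1: environment status (e.g. active, inactive).
# $2: environment type (e.g. production, staging, development).
# $3: current value of the DATA_SANITIZED variable (empty if it does not exist).
needs_sanitizing() {
  [ "$1" = "active" ] && [ "$2" != "production" ] && [ "$3" != "true" ]
}

# Example: an active staging environment that has not been flagged yet.
# if needs_sanitizing active staging ""; then echo "sanitize"; fi
```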
And just like that, your data is sanitized and you're well on your way to GDPR compliance!
If you have any further questions about our security and compliance capabilities or encounter any issues with the methods and/or steps above, reach out to our support team who’ll be happy to help.
Stay up-to-date on all the latest from us over on our social media and community channels. Catch us over on Dev.to, Reddit, and Discord.