Searching shortest path via PostgreSQL

Find shortest path/route via SQL, PostgreSQL to be exact.

Searching shortest path via PostgreSQL
Find shortest path via pgRouting

In summer 2021, our perfect startup team developed telegram bot, which allows you to find the shortest path between two instagram users. As a result you'll get a chain of users subscribed for each others. In this article, I'm going to tell you how we achieved it using pgRouting – extension for PostgreSQL.

To use this extension, you have to install it, just use following guide.

# 1. Enter your container with PostgreSQL or skip this step 
# if you have postgres server running on your system directly
dokku postgres:enter <service_name>  # dokku style
docker exec -it <container_name> bash  # docker style

# 2. Update apt packages
apt update && apt upgrade

# 3. Install pgRouting (for 12th version of postgresql)
apt install postgresql-12-pgrouting

# 4. Go to psql and enter your database
psql -U postgres <your_db_name>

# 5. Create extensions
CREATE EXTENSION PostGIS;
CREATE EXTENSION pgRouting;

# 6. Check version
SELECT PostGIS_Version();
SELECT pgr_version();

In our database we had user's relationships in following format (screenshot below). User from left column is in friendship with user from right column, i.e. they both subscribed for each other. By the way, we had about 4.2 million of such friendship edges 😯

To find shortest path between two user ids, we used pgRouting implementation of Dijkstra algorithm. You can find more examples of using on their page. In our case, we executed query below.

SELECT 
    ig_us.user_id, 
    ig_us.username, 
    ig_us.profile_pic_url, 
    row_number() over () as "order"
FROM 
    pgr_dijkstra(
        'SELECT 
            id, 
            user_id_1 as source, 
            user_id_2 as target, 
            1 as cost 
        FROM instagram_friendship',
        19010706, -- start user_id
        295372316, -- target user_id
        FALSE -- oriented graph
    ) pgr 
LEFT JOIN 
    instagram_user ig_us on pgr.node = ig_us.user_id;

To get result, which looks more prettier, we joined pgr_dijkstra result with instagram_user table. Below is final result of query, which is actually a path from shamaevnn to yurydud.

Path is found 🎉

With more than 200 million users in instagram_user table and about 4.2 million friendship edges it took about 7 seconds to find a path in average. We had thoughts about switching to neo4j (graph DB) for storing data and searching paths, but couldn't betray PostgreSQL ❤️, which we use almost everywhere.