Joining PL/pgSQL Results

Sometimes the Data You Need is Merely “Inferred” by Your Table Data

SQL is Not Dead Yet (But Sometimes Benefits From a Little Assistance)

For many people these days, SQL databases are “yesterday’s news.” NoSQL databases are definitely the “hot new thing” with many Web 2.0 developers. And, indeed, SQL is not a perfect language and there are definitely use cases for which a NoSQL approach might represent a better solution than an SQL database.

But, I would not be quick to write off SQL databases as some type of outdated relic of the past. In fact, there were numerous database systems that predated SQL and I, personally, used a handful of “NoSQL” databases before I learned SQL and possibly before the person who invented the term, “NoSQL,” was even born. A new job that required knowledge of Oracle v5 represented my introduction to SQL and I have been an unabashed fan ever since. Over the course of my career, I’ve worked with somewhere around 18 different languages and I can honestly say that SQL is one of my favorites.

Of course, SQL is not perfect—no langauge is—but I’ve always been impressed by both its clear design and the depth of problems it can address in a generally straightforward manner. The fact that it is a language in which you describe what you want instead of how to get it is one of its great strengths and I have often been surprised by the problems I can solve with relatively concise queries, especially when using a carefully designed database.

But, sometimes you actually need to be concerned with how your data is being processed on a row by row (record by record) basis. Often, this will be accomplished within the code of your application. Sometimes, though, this processing would be better placed closer to your database. In this case, it is worth considering PL/pgSQL, which I would guess is shorthand for “Procedural Language/PostgreSQL.” This language is clearly based on Oracle’s PL/SQL language, which provided a simple mechanism for wrapping SQL in an easy to learn procedural language while maintaining tight integration with the underlying database. There is a lot of shared syntax between PL/pgSQL and PL/SQL and if you can learn one, you can easily learn the other.

A Problem

humbl is currently working on a new Web application for facilitating scheduling within the context of community non-profit service provision. In a nutshell, it is being designed to connect people needing itinerant and intermittent services with people who provide them (who often do so on a part-time or even volunteer basis).

As such, our new application works a lot with time and time blocks, which can be tricky. The system tracks blocks of time during which Workers are available in the form of Offers. Those who receive services are known as Participants and they post Asks for services. When the time blocks of an Ask and an Offer overlap, that yields a Possibility from which an Appointment can be scheduled.

The trickiest part of this design is that as Appointments are made, the time blocks associated with Offers get “carved up.” As such, when Possibilities are calculated (which is done in “real time”), they must be based upon “derived” time blocks that remove from Offers the time used for Appointments.

My initial plan was to perform this calculation at the application level (specifically, within the server component). Upon reflection, however, I found that this task could be much more elegantly solved by using PL/pgSQL.

An Elegant Solution

One of the things I like about SQL is that while most of the data sources you specify in a FROM clause will be data tables, this does not have to be the case. You can also use a VIEW as a data source, or even another query. And, of course, the same data table can appear multiple times in the same query and be joined against itself.

In PostgreSQL, you can also use the results of a PL/pgSQL FUNCTION in a query’s FROM clause. This is precisely the solution I selected for this requirement.

Let’s recap the use case for this scenario. We are primarily interested in taking an Offer (which has a time range) and pulling out the time blocks that have an Appointment (which also have time ranges). Once this is done, there is a derived, or inferred, set of available Offer times. There is no direct way (at least that I can think of) to use raw SQL to derive that inferred set as one needs to “walk” through each Offer’s Appointments and calculate the time blocks between each of the latter. While this is not something that can be done in SQL alone, when it is mixed with some simple PL/pgSQL code, the task becomes trivial. As a bonus, the constructed FUNCTION becomes part of the database itself, meaning that it is accessible to queries and other database FUNCTIONS, too.

All PL/pgSQL FUNCTIONS must return a value and the type of that value must be defined when the FUNCTION is created. One of the types that can be returned is a SETOF database entities, such as a TABLE. Thus, the first task is to create the function with the appropriate returned database type. In our use case, the declaration would look something like the following.

CREATE OR REPLACE FUNCTION offers_inferred(offerId INTEGER)
RETURNS SETOF offers AS …

Next, we write our procedural code. My complete solution has roughly 90 lines in total (note that there is additional functionality in my production code as there are other “carve-outs” in our application).

We start with the DECLARE section, which essentially includes a RECORD that is used to hold the original Offer and is then modified as the “inferred” offers are calculated. We will also need a CURSOR to retrieve all of the Appointments, which are then processed through a loop. Finally, there are some “control” variables assigned as well. Here are the two “main” ingredients.

DECLARE
   offerRecord RECORD;
   appointmentsCursor CURSOR(cursorOfferId INTEGER) FOR
      SELECT …
   ;

The main part of the code then pulls in the Offer data.

SELECT *
INTO offerRecord
FROM offers
WHERE offer_id = offerId
;

We then execute the CURSOR and loop through the results. Within the loop we calculate, among other values, the proper start and end timestamps for the inferred data. As those are calculated, I replace the appropriate columns in the offerRecord with the inferred data. Once the new record is ready, it gets returned back to the caller by using the RETURN NEXT instruction. Note that the “NEXT” part differentiates the mechanism from a normal RETURN in that RETURN NEXT does not exit the FUNCTION whereas a basic RETURN would.

FOR appointmentRecord IN appointmentsCursor(offerId) LOOP
-- do your processing here
RETURN NEXT offerRecord;
END LOOP;

Once your PL/pgSQL FUNCTION is complete and tested, it can be used in a query just like any other data table. Here is how humbl’s forthcoming application uses this precise mechanism.

SELECT …
FROM offers a
JOIN offers_inferred(a.offer_id) b ON true
…

Note that the join condition is simply set as “true.” This is acceptable as we are passing in the offer_id, which would represent the proper join condition. Since it is being passed in, we can be assured that only the correct rows will be returned by our PL/pgSQL function with no need for more explicit join criteria.

Not only is this mechanism used in multiple queries in our application, thus showing its utility for multiple types of uses, but having it centrally defined in the database allows us to just modify the underlying FUNCTION when additional capabilities are needed. Again, humbl has added new “carve-outs” since this function was originally created. The new PL/pgSQL code was simply added and the new functionality immediately was available within our application.

Nice!

Written by Bill Sofko on May 30, 2022.

Your Name
E-Mail Address
Organization
Phone
Message

Joining PL/pgSQL Results

Sometimes the Data You Need is Merely “Inferred” by Your Table Data

SQL is Not Dead Yet (But Sometimes Benefits From a Little Assistance)

A Problem

An Elegant Solution

Contact Us