logo

FOSS4G 2022 | Manipulating text with PostgreSQL - lesser known PG jewels

FOSS4G

FOSS4G

10 followers

time1 yr agoview2 views

PostgreSQL is the most advanced opensource RDBMS. As GIS folks, you most probably use it in combination with PostGIS, its Geospatial plugin.

When dealing with Geospatial data, we usually focus on geometries. But most of feature attributes are text data. Of course, filtering on these text data with standard SQL capabilities is a day-to-day operation for database users.

But PostgreSQL provides much more capabilities when it comes down to text data management. In this presentation, we will go through a few of them.

After a quick look at standard text functions in PostgreSQL, we will discover the lesser known fuzzy matching modules :

  • pg_trgm extension allows for string searches using trigraphs to determine a similarity rank between text items
  • fuzzystrmatch extension provides fuzzy matching functions like soundex, Levenshtein, metaphone

Then, we will explore Full Text Search ( FTS ) PostgreSQL capabilities.

Last but not least, we will peek inside PostgreSQL collation concept, which has nothing to do with your lunch. Collations are a powerful feature in PostgreSQL allowing to adapt the way you deal with text data according to the localization. Like trying to answer this - apparently - obvious question : is '12' before or after '2' ?

And, because we can, display all of this on a map :-)

Vincent Picavet

https://talks.osgeo.org/foss4g-2022/talk/QEHLM7/

#foss4g2022 #generaltrack #UsecasesAndapplications

Loading comments...