I asked this question Accessing external XML files as variables in a PSQL script (sourced from a bash script) on StackOverflow, and also posted my own answer.

Here is an extended, complete version of that issue. I additionally detail the scraping of the XML data from HMDB.ca metabolome files into PostgreSQL tables.

PRELIMINARIES

My data and file locations are:

[victoria@victoria hmdb]$ pwd
  /mnt/Vancouver/Programming/data/hmdb

[victoria@victoria hmdb]$ tree -L 2 -F
  .
  ├── bash/
  │   ├── hmdb.sh*
  ├── data/
  │   ├── hmdb_metabolites_5000-01.xml
  │   ├── hmdb_metabolites_5000-02.xml
  │   └── hmdb_metabolites_5000-03.xml
  └── sql/
    └── hmdb.sql

This next bit may seem complicated but really it’s quite simple.

My PostgreSQL data_directory (PSQL: SHOW data_directory;] is

/mnt/Vancouver/Programming/RDB/postgres/postgres/data

I added a symbolic link, such that

/mnt/Vancouver/Programming/RDB/postgres/postgres/data/hmdb

points to

/mnt/Vancouver/Programming/data/hmdb/

so that I can work with data outside the Postgres data directory without having to move my data there and chown it to postgres:postgres.

[Within that symlinked directory I can work with my files with the normal file ownership (victoria:victoria), which makes it much easier (IMHO) to work with those data.]

Thus, relative to the Postgres data directory,

/mnt/Vancouver/Programming/RDB/postgres/postgres/data/

the path to

hmdb_metabolites_5000-01.xml

is

hmdb/data/hmdb_metabolites_5000-01.xml

and I need to prefix those XML file names in the BASH script with hmdb/data/, which resolves the path from the Postgres data directory through my local (symlinked) directory to the XML data file(s).

[Otherwise, you can just move your data files to your PostgreSQL data directory, chown it to postgres:postgres, and adjust the paths above to point directly to the files.]


hmdb.sh

#!/bin/bash
# vim: syntax=sh

# /mnt/Vancouver/Programming/data/hmdb/bash/hmdb.sh

# "/mnt/Vancouver/Programming/RDB/postgres/postgres/data/hmdb"
# symlinks to "/mnt/Vancouver/Programming/data/hmdb/"
# so here you have to refer to
# "/mnt/Vancouver/Programming/data/hmdb/data/hmdb_metabolites_5000-01.xml"
# as "hmdb/data/hmdb_metabolites_5000-01.xml"
# (Relative to the Postgres data directory
# "/mnt/Vancouver/Programming/RDB/postgres/postgres/data/", the path to
#  "hmdb_metabolites_5000-01.xml" is "hmdb/data/hmdb_metabolites_5000-01.xml".) 

DATA_DIR=../data/
PGOPTIONS="-c client_min_messages=error" 
i=0

for file in $DATA_DIR/*.xml
  do
    if [ $i == 0 ]; then
      # printf "Hello!\n"
      psql postgres -c "select pg_terminate_backend(pid) from pg_stat_activity where datname='hmdb';"
      psql postgres -c "DROP DATABASE hmdb;"
      psql postgres -c "CREATE DATABASE hmdb;"
    fi
    ((++i))
    # bash_var=hmdb/$(echo ${file##*/})
    bash_var=hmdb/data/$(echo ${file##*/})
    echo $bash_var
    psql -d hmdb -v bash_var=$bash_var -f ../sql/hmdb.sql
done

psql hmdb -c "SELECT *, tableoid::regclass::text AS tablename FROM hmdb_identifiers ORDER BY accession DESC LIMIT 5;"
psql hmdb -c "SELECT count(id) from hmdb_identifiers;"

psql hmdb -c "SELECT *, tableoid::regclass::text AS tablename FROM hmdb_synonyms ORDER BY accession DESC LIMIT 5;"
psql hmdb -c "SELECT count(id) from hmdb_synonyms;"

# psql hmdb -c "SELECT *, tableoid::regclass::text AS tablename FROM hmdb_identifiers WHERE accession='hmdb0000208';"
# psql hmdb -c "SELECT *, tableoid::regclass::text AS tablename FROM hmdb_synonyms WHERE accession='hmdb0000208';"

# psql hmdb -c "SELECT *, tableoid::regclass::text AS tablename FROM hmdb_identifiers WHERE accession ILIKE 'HMDB0000208';"
# psql hmdb -c "SELECT *, tableoid::regclass::text AS tablename FROM hmdb_synonyms WHERE accession ILIKE 'HMDB0000208';"

psql hmdb -c "SELECT tableoid::regclass::text AS tablename, synonym from hmdb_synonyms ORDER BY length(synonym) DESC LIMIT 5;"

# Enable this as a audible alert for completion of long-running scripts:
for i in 1 2 3; do { aplay /mnt/Vancouver/Programming/scripts/PHASER.WAV ; sleep 0.25 ; } &> /dev/null; done

BASH script notes:

  • select pg_terminate_backend(pid) from pg_stat_activity where datname='hmdb'; disconnects the PSQL user (if connected) from the database hmdb, allowing the script to drop and (re)create that database, de novo.

  • $(echo ${file##*/}) returns the filename including extension (basename). Note the added hmdb/data/ prefix bit.

  • I also tried $i -eq 0, but this did not work (here)

  • ((++i)) increments i by one, in bash (more here)

  • psql -v <var>: variable assignment (see manpage: man psql | grep \\-d)

  • phaser.wav

hmdb.sql

\c hmdb

CREATE TABLE hmdb_identifiers (
  id SERIAL,
  accession VARCHAR(15) NOT NULL,
  name VARCHAR(300) NOT NULL,
  cas_number VARCHAR(12),
  pubchem_cid INT,
  PRIMARY KEY (id),
  UNIQUE (accession)
);

CREATE TABLE hmdb_synonyms (
  id SERIAL,
  accession VARCHAR(15) NOT NULL,
  synonym VARCHAR(300) NOT NULL,
  PRIMARY KEY (id),
  FOREIGN KEY (accession) REFERENCES hmdb_identifiers (accession)
    ON UPDATE CASCADE ON DELETE CASCADE,
  UNIQUE (accession, synonym)
);

-- \echo '\n[hmdb.sql] bash_var:' :bash_var '\n'

-- I tried doing this with one TEMP TABLE but I got duplicate values for otherwise
-- non-duplicate data (accession; cas_number; ...), due to the nature of unnesting
-- the XML data viz-a-viz the `synonyms` field. In doing so, with those data xpath
-- generated duplicate entries for accession, ... and consequently I was getting
-- `duplicate key value violates unique constraint 'hmdb_identifiers_accession_key'`
-- errors with the SQL INSERT statements.
-- The cost is those data (XML files) need to be read/parsed for each TEMP TABLE.

CREATE TEMP TABLE tmp_table AS 
SELECT 
  (xpath('//accession/text()', x))[1]::text::varchar(15) AS accession
  ,(xpath('//name/text()', x))[1]::text::varchar(300) AS name 
  ,(xpath('//cas_registry_number/text()', x))[1]::text::varchar(12) AS cas_number 
  ,(xpath('//pubchem_compound_id/text()', x))[1]::text::int AS pubchem_cid 
-- -- FROM unnest(xpath('//metabolite', XMLPARSE(DOCUMENT convert_from(pg_read_binary_file('hmdb/hmdb.xml'), 'UTF8')))) x
FROM unnest(xpath('//metabolite', XMLPARSE(DOCUMENT convert_from(pg_read_binary_file(:'bash_var'), 'UTF8')))) x
;

CREATE TEMP TABLE tmp_table2 AS 
SELECT 
  (xpath('//accession/text()', x))[1]::text::varchar(15) AS accession
  ,unnest(xpath('//synonyms/synonym/text()', x))::text::varchar(300) AS synonym
FROM unnest(xpath('//metabolite', XMLPARSE(DOCUMENT convert_from(pg_read_binary_file(:'bash_var'), 'UTF8')))) x
;

INSERT INTO hmdb_identifiers (accession, name, cas_number, pubchem_cid)
  SELECT lower(accession), lower(name), lower(cas_number), pubchem_cid FROM tmp_table;

INSERT INTO hmdb_synonyms (accession, synonym)
  SELECT lower(accession), lower(synonym) FROM tmp_table2
  ON CONFLICT DO NOTHING;

-- https://stackoverflow.com/questions/38619072/how-to-replace-multiple-special-characters-in-postgres-9-5
UPDATE hmdb_synonyms SET synonym = REPLACE(synonym,'&gt;','>');
UPDATE hmdb_synonyms SET synonym = REPLACE(synonym,'&lt;','<');

-- TEST ~ MANUAL UPDATES:
INSERT INTO hmdb_identifiers (accession, name, cas_number, pubchem_cid)
  values (lower('HMDB0240268'), lower('Farnesyl acetate'), '4128-17-0', NULL);
INSERT INTO hmdb_synonyms (accession, synonym)
  values (lower('HMDB0240268'), lower('(2E,6E)-3,7,11-Trimethyldodeca-2,6,10-trien-1-yl acetate'));

DROP TABLE tmp_table;
DROP TABLE tmp_table2;

SQL script notes:

  • In the xpath statements I recast the ::text to (for example) ::text::varchar(15). I initially did this to recast the XML text for the PubChem numeric id to the Postgres integer datatype (::text::int), and realized it was probably good practice to explicitly recast all the data, per the Postgres table schema.

  • More significantly regarding the latter point, if I did NOT recast the datatype in the xpath statement, and the field entry (e.g. synonym name length) exceeded the SQL varchar(300) length limit, those data threw a PSQL error and the table did not update (i.e. was blank: no entries, anywhere!).

  • The

    UPDATE hmdb_synonyms SET synonym = REPLACE(synonym,'&gt;','>');

line updates entries like

[hmdb]# SELECT synonym from hmdb_synonyms where accession='hmdb0006669';

o-6-deoxy-alpha-l-galactopyranosyl-(1-&gt;3)-o-[o-6-deoxy-alpha-l-galactopyranosyl-(1-&gt;4)-o-

to

o-6-deoxy-alpha-l-galactopyranosyl-(1->3)-o-[o-6-deoxy-alpha-l-galactopyranosyl-(1->4)-o-
  • It appears that on rare occasion duplicate data conflict with one another, raising errors and stopping the script (no table update):

    psql:../sql/hmdb.sql:85: ERROR:  23505: duplicate key value violates unique constraint `hmdb_identifiers_accession_key`
    DETAIL:  Key (accession)=(hmdb0240268) already exists.
    SCHEMA NAME:  public
    TABLE NAME:  hmdb_identifiers
    CONSTRAINT NAME:  hmdb_identifiers_accession_key
    LOCATION:  _bt_check_unique, nbtinsert.c:434
    psql:../sql/hmdb.sql:87: ERROR:  23505: duplicate key value violates unique constraint `hmdb_synonyms_synonym_key`
    DETAIL:  Key (synonym)=((2e,6e)-3,7,11-trimethyldodeca-2,6,10-trien-1-yl acetate) already exists.
    SCHEMA NAME:  public
    TABLE NAME:  hmdb_synonyms
    CONSTRAINT NAME:  hmdb_synonyms_synonym_key
    LOCATION:  _bt_check_unique, nbtinsert.c:434

The solution was to add the ON CONFLICT DO NOTHING statement to the INSERT INTO hmdb_synonyms ... statement.

BASH script execution:

[victoria@victoria bash]$ ./hmdb.sh 
pg_terminate_backend 
----------------------

DROP DATABASE
CREATE DATABASE

hmdb/data/hmdb_metabolites_5000-01.xml
You are now connected to database "hmdb" as user "victoria".
CREATE TABLE
CREATE TABLE
SELECT 1
SELECT 13
INSERT 0 1
INSERT 0 13
UPDATE 13
UPDATE 13
INSERT 0 1
INSERT 0 1
DROP TABLE
DROP TABLE

hmdb/data/hmdb_metabolites_5000-02.xml
You are now connected to database "hmdb" as user "victoria".
psql:../sql/hmdb.sql:13: ERROR:  42P07: relation "hmdb_identifiers" already exists
LOCATION:  heap_create_with_catalog, heap.c:1067
psql:../sql/hmdb.sql:23: ERROR:  42P07: relation "hmdb_synonyms" already exists
LOCATION:  heap_create_with_catalog, heap.c:1067
SELECT 1
SELECT 12
INSERT 0 1
INSERT 0 12
UPDATE 26
UPDATE 26
psql:../sql/hmdb.sql:85: ERROR:  23505: duplicate key value violates unique constraint "hmdb_identifiers_accession_key"
DETAIL:  Key (accession)=(hmdb0240268) already exists.
SCHEMA NAME:  public
TABLE NAME:  hmdb_identifiers
CONSTRAINT NAME:  hmdb_identifiers_accession_key
LOCATION:  _bt_check_unique, nbtinsert.c:434
psql:../sql/hmdb.sql:87: ERROR:  23505: duplicate key value violates unique constraint "hmdb_synonyms_accession_synonym_key"
DETAIL:  Key (accession, synonym)=(hmdb0240268, (2e,6e)-3,7,11-trimethyldodeca-2,6,10-trien-1-yl acetate) already exists.
SCHEMA NAME:  public
TABLE NAME:  hmdb_synonyms
CONSTRAINT NAME:  hmdb_synonyms_accession_synonym_key
LOCATION:  _bt_check_unique, nbtinsert.c:434
DROP TABLE
DROP TABLE

hmdb/data/hmdb_metabolites_5000-03.xml
You are now connected to database "hmdb" as user "victoria".
psql:../sql/hmdb.sql:13: ERROR:  42P07: relation "hmdb_identifiers" already exists
LOCATION:  heap_create_with_catalog, heap.c:1067
psql:../sql/hmdb.sql:23: ERROR:  42P07: relation "hmdb_synonyms" already exists
LOCATION:  heap_create_with_catalog, heap.c:1067
SELECT 1
SELECT 43
INSERT 0 1
INSERT 0 43
UPDATE 69
UPDATE 69
psql:../sql/hmdb.sql:85: ERROR:  23505: duplicate key value violates unique constraint "hmdb_identifiers_accession_key"
DETAIL:  Key (accession)=(hmdb0240268) already exists.
SCHEMA NAME:  public
TABLE NAME:  hmdb_identifiers
CONSTRAINT NAME:  hmdb_identifiers_accession_key
LOCATION:  _bt_check_unique, nbtinsert.c:434
psql:../sql/hmdb.sql:87: ERROR:  23505: duplicate key value violates unique constraint "hmdb_synonyms_accession_synonym_key"
DETAIL:  Key (accession, synonym)=(hmdb0240268, (2e,6e)-3,7,11-trimethyldodeca-2,6,10-trien-1-yl acetate) already exists.
SCHEMA NAME:  public
TABLE NAME:  hmdb_synonyms
CONSTRAINT NAME:  hmdb_synonyms_accession_synonym_key
LOCATION:  _bt_check_unique, nbtinsert.c:434

DROP TABLE
DROP TABLE

id |  accession  |        name        | cas_number | pubchem_cid |    tablename     
----+-------------+--------------------+------------+-------------+------------------
  2 | hmdb0240268 | farnesyl acetate   | 4128-17-0  |      [NULL] | hmdb_identifiers
  5 | hmdb0000005 | 2-ketobutyric acid | 600-18-0   |          58 | hmdb_identifiers
  3 | hmdb0000002 | 1,3-diaminopropane | 109-76-2   |         428 | hmdb_identifiers
  1 | hmdb0000001 | 1-methylhistidine  | 332-80-9   |       92105 | hmdb_identifiers

count 
-------
    4

id |  accession  |                         synonym                          |   tablename   
----+-------------+----------------------------------------------------------+---------------
14 | hmdb0240268 | (2e,6e)-3,7,11-trimethyldodeca-2,6,10-trien-1-yl acetate | hmdb_synonyms
46 | hmdb0000005 | α-oxo-n-butyric acid                                     | hmdb_synonyms
45 | hmdb0000005 | α-oxo-n-butyrate                                         | hmdb_synonyms
41 | hmdb0000005 | α-ketobutyric acid                                       | hmdb_synonyms
40 | hmdb0000005 | α-ketobutyrate                                           | hmdb_synonyms

count 
-------
    69

  tablename   |                         synonym                          
---------------+----------------------------------------------------------
hmdb_synonyms | (2s)-2-amino-3-(1-methyl-1h-imidazol-4-yl)propanoic acid
hmdb_synonyms | (2e,6e)-3,7,11-trimethyldodeca-2,6,10-trien-1-yl acetate
hmdb_synonyms | (2s)-2-amino-3-(1-methyl-1h-imidazol-4-yl)propanoate
hmdb_synonyms | trimethylenediamine dihydrochloride
hmdb_synonyms | alpha-ketobutyric acid, sodium salt

[victoria@victoria bash]$

PSQL

After running the BASH script, if you were previously connected to the hmdb database and you run a query, you’ll get a warning about being disconnected; e.g.,

[hmdb]# SELECT * from hmdb_identifiers;

FATAL:  57P01: terminating connection due to administrator command
LOCATION:  ProcessInterrupts, postgres.c:2884
server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.

[hmdb]#

You can either:

  • rerun that PSQL query (as you are now reconnected);

  • manually reconnect, \c hmdb; then run your query; or

  • execute this command (note the preceding semicolon; you will get the error message, but this is a quick one-liner):

    ;SELECT * from hmdb_identifiers;

PSQL SCRIPT EXECUTION:

In any event, here is the PSQL output, after running the BASH script (note that due to the UTF8 specification in the PSQL script, we get Greek letters properly displayed, e.g. α-ketoglutarate | α-keto-δ-guanidinovalerate | α-oxo-N-butyric acid | ...):


[hmdb]# \c hmdb;
You are now connected to database "hmdb" as user "victoria".

[hmdb]# SELECT * from hmdb_identifiers LIMIT 5;

id  accession         name        cas_number pubchem_cid 
-- ----------- ------------------ ---------- -----------
1 hmdb0000001 1-methylhistidine  332-80-9         92105
2 hmdb0240268 farnesyl acetate   4128-17-0       [NULL]
3 hmdb0000002 1,3-diaminopropane 109-76-2           428
5 hmdb0000005 2-ketobutyric acid 600-18-0            58

[hmdb]# SELECT * from hmdb_synonyms LIMIT 5;

id  accession                          synonym                          
-- ----------- --------------------------------------------------------
1 hmdb0000001 (2s)-2-amino-3-(1-methyl-1h-imidazol-4-yl)propanoic acid
2 hmdb0000001 1-methylhistidine
3 hmdb0000001 pi-methylhistidine
4 hmdb0000001 (2s)-2-amino-3-(1-methyl-1h-imidazol-4-yl)propanoate
5 hmdb0000001 1 methylhistidine

[hmdb]# SELECT synonym from hmdb_synonyms ORDER BY length(synonym) DESC LIMIT 5;

                        synonym                          
--------------------------------------------------------
(2s)-2-amino-3-(1-methyl-1h-imidazol-4-yl)propanoic acid
(2e,6e)-3,7,11-trimethyldodeca-2,6,10-trien-1-yl acetate
(2s)-2-amino-3-(1-methyl-1h-imidazol-4-yl)propanoate
trimethylenediamine dihydrochloride
alpha-ketobutyric acid, sodium salt

[hmdb]# SELECT * from hmdb_synonyms ORDER BY accession DESC LIMIT 4;

id  accession                          synonym                          
-- ----------- --------------------------------------------------------
14 hmdb0240268 (2e,6e)-3,7,11-trimethyldodeca-2,6,10-trien-1-yl acetate
29 hmdb0000005 2-oxobutyric acid
28 hmdb0000005 2-ketobutanoic acid
30 hmdb0000005 3-methyl pyruvic acid

[hmdb]# DELETE FROM hmdb_identifiers WHERE accession='hmdb0240268';
DELETE 1

[hmdb]# SELECT count(id) from hmdb_synonyms;

count 
-----
  68

[hmdb]# SELECT * from hmdb_synonyms ORDER BY accession DESC LIMIT 3;

id  accession         synonym        
-- ----------- ---------------------
28 hmdb0000005 2-ketobutanoic acid
29 hmdb0000005 2-oxobutyric acid
30 hmdb0000005 3-methyl pyruvic acid

[hmdb]# SELECT * from hmdb_synonyms WHERE synonym ILIKE '%lpha-keto%';

id  accession                synonym               
-- ----------- -----------------------------------
31 hmdb0000005 alpha-ketobutyrate
32 hmdb0000005 alpha-ketobutyric acid
61 hmdb0000005 alpha-keto-n-butyrate
62 hmdb0000005 alpha-keto-n-butyric acid
63 hmdb0000005 alpha-ketobutric acid
70 hmdb0000005 alpha-ketobutyric acid, sodium salt


## DATA INGRESS CHECK:
## ===================

[hmdb]# SELECT * from hmdb_synonyms as synonym where accession ILIKE 'hmdb0106899';

  id    accession                                                  synonym                                                 
------ ----------- -------------------------------------------------------------------------------------------------------
923304 hmdb0106899 (2s)-1-[(15-methylhexadecanoyl)oxy]-3-[(12-methyltridecanoyl)oxy]propan-2-yl 17-methyloctadecanoic acid
923305 hmdb0106899 1-isoheptadecanoyl-2-isononadecanoyl-3-isotetradecanoyl-glycerol
923306 hmdb0106899 tg(i-17:0/i-19:0/i-14:0)
923307 hmdb0106899 triacylglycerol
923308 hmdb0106899 triglyceride
923309 hmdb0106899 tg(50:0)
923310 hmdb0106899 tracylglycerol(50:0)
923311 hmdb0106899 tag(50:0)
923312 hmdb0106899 tracylglycerol(i-17:0/i-19:0/i-14:0)
923313 hmdb0106899 tag(i-17:0/i-19:0/i-14:0)

[hmdb]#

## COMPARE THOSE DATA (ABOVE) WITH THE SOURCE DATA FILE DATA (not uploaded):
## hmdb_metabolites-07.xml :

[victoria@victoria data]$ dpl

  Wed May  2 19:46:53 PDT 2018
  /mnt/Vancouver/Programming/data/hmdb/data

  total 3606796
  -rw-r--r-- 1 victoria victoria 500008714 Apr 25 15:32 hmdb_metabolites-01.xml
  -rw-r--r-- 1 victoria victoria 500033500 Apr 25 15:33 hmdb_metabolites-02.xml
  -rw-r--r-- 1 victoria victoria 500012180 Apr 25 15:34 hmdb_metabolites-03.xml
  -rw-r--r-- 1 victoria victoria 500013231 Apr 25 15:35 hmdb_metabolites-04.xml
  -rw-r--r-- 1 victoria victoria 500013021 Apr 25 15:37 hmdb_metabolites-05.xml
  -rw-r--r-- 1 victoria victoria 500005018 Apr 25 15:38 hmdb_metabolites-06.xml
  -rw-r--r-- 1 victoria victoria 500020416 Apr 25 15:39 hmdb_metabolites-07.xml
  -rw-r--r-- 1 victoria victoria 193200978 Apr 25 15:40 hmdb_metabolites-08.xml

[victoria@victoria data]$ rg . -i -A25 -e "hmdb0106899"

  hmdb_metabolites-07.xml
  948166:  <accession>HMDB0106899</accession>
  948167-  <status>expected</status>
  948168-  <secondary_accessions>
  948169-  </secondary_accessions>
  948170-  <name>TG(i-17:0/i-19:0/i-14:0)</name>
  948171-  <description>TG(i-17:0/i-19:0/i-14:0) belongs to the family of triradyglycerols, which are glycerolipids lipids containing a common glycerol backbone to which at least one fatty acyl group is esterified. Their general formula is [R1]OCC(CO[R2])O[R3]. TG(i-17:0/i-19:0/i-14:0) is made up of one 15-methylhexadecanoyl(R1), one 17-methyloctadecanoyl(R2), and one 12-methyltridecanoyl(R3).</description>

  948172-  <synonyms>
  948173-    <synonym>(2S)-1-[(15-Methylhexadecanoyl)oxy]-3-[(12-methyltridecanoyl)oxy]propan-2-yl 17-methyloctadecanoic acid</synonym>
  948174-    <synonym>1-isoheptadecanoyl-2-isononadecanoyl-3-isotetradecanoyl-glycerol</synonym>
  948175-    <synonym>TG(i-17:0/i-19:0/i-14:0)</synonym>
  948176-    <synonym>Triacylglycerol</synonym>
  948177-    <synonym>Triglyceride</synonym>
  948178-    <synonym>TG(50:0)</synonym>
  948179-    <synonym>Tracylglycerol(50:0)</synonym>
  948180-    <synonym>TAG(50:0)</synonym>
  948181-    <synonym>Tracylglycerol(i-17:0/i-19:0/i-14:0)</synonym>
  948182-    <synonym>TAG(i-17:0/i-19:0/i-14:0)</synonym>
  948183-  </synonyms>

  948184-  <chemical_formula>C53H102O6</chemical_formula>
  948185-  <average_molecular_weight>835.393</average_molecular_weight>
  948186-  <monisotopic_molecular_weight>834.767641004</monisotopic_molecular_weight>
  948187-  <iupac_name>(2S)-1-[(15-methylhexadecanoyl)oxy]-3-[(12-methyltridecanoyl)oxy]propan-2-yl 17-methyloctadecanoate</iupac_name>
  948188-  <traditional_iupac>(2S)-1-[(15-methylhexadecanoyl)oxy]-3-[(12-methyltridecanoyl)oxy]propan-2-yl 17-methyloctadecanoate</traditional_iupac>
  948189-  <cas_registry_number/>
  948190-  <smiles>[H][C@](COC(=O)CCCCCCCCCCCCCC(C)C)(COC(=O)CCCCCCCCCCC(C)C)OC(=O)CCCCCCCCCCCCCCCC(C)C</smiles>
  948191-  <inchi>InChI=1S/C53H102O6/c1-47(2)39-33-27-21-15-11-8-7-9-13-18-26-32-38-44-53(56)59-50(46-58-52(55)43-37-31-25-20-19-23-29-35-41-49(5)6)45-57-51(54)42-36-30-24-17-14-10-12-16-22-28-34-40-48(3)4/h47-50H,7-46H2,1-6H3/t50-/m0/s1</inchi>

[victoria@victoria data]$ 

[hmdb]# \l

                             List of databases
   Name     Owner   Encoding   Collate      Ctype      Access privileges   
---------- -------- -------- ----------- ----------- ---------------------
hmdb       victoria UTF8     en_US.UTF-8 en_US.UTF-8 
metabolism victoria UTF8     en_US.UTF-8 en_US.UTF-8 
ncbi       victoria UTF8     en_US.UTF-8 en_US.UTF-8 
postgres   postgres UTF8     en_US.UTF-8 en_US.UTF-8 
template0  postgres UTF8     en_US.UTF-8 en_US.UTF-8 =c/postgres          +
                                                     postgres=CTc/postgres
template1  postgres UTF8     en_US.UTF-8 en_US.UTF-8 =c/postgres          +
                                                     postgres=CTc/postgres
zzz        victoria UTF8     en_US.UTF-8 en_US.UTF-8 

[hmdb]# \dt

           List of relations
Schema       Name        Type   Owner   
------ ----------------- ----- --------
public hmdb_identifiers  table victoria
public hmdb_synonyms     table victoria

[hmdb]# \d hmdb_identifiers

                                  Table "public.hmdb_identifiers"
  Column             Type          Collation Nullable                   Default                    
----------- ---------------------- --------- -------- --------------------------------------------
id          integer                          not null nextval('hmdb_identifiers_id_seq'::regclass)
accession   character varying(15)            not null 
name        character varying(300)           not null 
cas_number  character varying(12)                     
pubchem_cid integer                                   
Indexes:
    "hmdb_identifiers_pkey" PRIMARY KEY, btree (id)
    "hmdb_identifiers_accession_key" UNIQUE CONSTRAINT, btree (accession)
Referenced by:
    TABLE "hmdb_synonyms" CONSTRAINT "hmdb_synonyms_accession_fkey" FOREIGN KEY (accession) REFERENCES hmdb_identifiers(accession) ON UPDATE CASCADE ON DELETE CASCADE

[hmdb]# \d hmdb_synonyms

                                 Table "public.hmdb_synonyms"
 Column            Type          Collation Nullable                  Default                  
--------- ---------------------- --------- -------- -----------------------------------------
id        integer                          not null nextval('hmdb_synonyms_id_seq'::regclass)
accession character varying(15)            not null 
synonym   character varying(300)           not null 
Indexes:
    "hmdb_synonyms_pkey" PRIMARY KEY, btree (id)
    "hmdb_synonyms_accession_synonym_key" UNIQUE CONSTRAINT, btree (accession, synonym)
Foreign-key constraints:
    "hmdb_synonyms_accession_fkey" FOREIGN KEY (accession) REFERENCES hmdb_identifiers(accession) ON UPDATE CASCADE ON DELETE CASCADE

[hmdb]# SELECT * from hmdb_identifiers LIMIT 5;

id  accession          name          cas_number pubchem_cid 
-- ----------- --------------------- ---------- -----------
 1 hmdb0000001 1-methylhistidine     332-80-9         92105
 2 hmdb0000002 1,3-diaminopropane    109-76-2           428
 3 hmdb0000005 2-ketobutyric acid    600-18-0            58
 4 hmdb0000008 2-hydroxybutyric acid 600-15-7         11266
 5 hmdb0000010 2-methoxyestrone      362-08-3        440624

[hmdb]# SELECT * from hmdb_synonyms LIMIT 5;

 id   accession          synonym          
---- ----------- ------------------------
4117 hmdb0000182 2,6-diaminohexanoic acid
4118 hmdb0000182 6-amino-aminutrin
4119 hmdb0000182 6-amino-l-norleucine
4120 hmdb0000182 a-lysine
4121 hmdb0000182 alpha-lysine

[hmdb]# SELECT i.accession, i.name, s.synonym FROM hmdb_identifiers as i RIGHT JOIN hmdb_synonyms AS s ON i.accession = s.accession LIMIT 50;

 accession         name                                synonym                          
----------- ------------------ --------------------------------------------------------
hmdb0000001 1-methylhistidine  1 methylhistidine
hmdb0000001 1-methylhistidine  1-methylhistidine
hmdb0000001 1-methylhistidine  1-methyl histidine
hmdb0000001 1-methylhistidine  1-methyl-histidine
hmdb0000001 1-methylhistidine  1-methylhistidine dihydrochloride
hmdb0000001 1-methylhistidine  1-methyl-l-histidine
hmdb0000001 1-methylhistidine  1-mhis
hmdb0000001 1-methylhistidine  1-n-methyl-l-histidine
hmdb0000001 1-methylhistidine  (2s)-2-amino-3-(1-methyl-1h-imidazol-4-yl)propanoate
hmdb0000001 1-methylhistidine  (2s)-2-amino-3-(1-methyl-1h-imidazol-4-yl)propanoic acid
hmdb0000001 1-methylhistidine  l-1-methylhistidine
hmdb0000001 1-methylhistidine  n1-methyl-l-histidine
hmdb0000001 1-methylhistidine  pi-methylhistidine
hmdb0000002 1,3-diaminopropane 1,3-diaminepropane
hmdb0000002 1,3-diaminopropane 1,3-diamino-n-propane
hmdb0000002 1,3-diaminopropane 1,3-propanediamine
hmdb0000002 1,3-diaminopropane 1,3-propylenediamine
hmdb0000002 1,3-diaminopropane 1,3-trimethylenediamine
hmdb0000002 1,3-diaminopropane 3-aminopropylamine
hmdb0000002 1,3-diaminopropane a,w-propanediamine
hmdb0000002 1,3-diaminopropane propane-1,3-diamine
hmdb0000002 1,3-diaminopropane tn
hmdb0000002 1,3-diaminopropane trimethylenediamine
hmdb0000002 1,3-diaminopropane trimethylenediamine dihydrochloride
hmdb0000002 1,3-diaminopropane trimethylenediamine hydrochloride
hmdb0000005 2-ketobutyric acid 2-ketobutanoate
hmdb0000005 2-ketobutyric acid 2-ketobutanoic acid
hmdb0000005 2-ketobutyric acid 2-ketobutyrate
hmdb0000005 2-ketobutyric acid 2-oxobutanoate
hmdb0000005 2-ketobutyric acid 2-oxo-butanoate
hmdb0000005 2-ketobutyric acid 2-oxo-butanoic acid
hmdb0000005 2-ketobutyric acid 2-oxobutanoic acid
hmdb0000005 2-ketobutyric acid 2-oxobutyrate
hmdb0000005 2-ketobutyric acid 2-oxo-butyrate
hmdb0000005 2-ketobutyric acid 2-oxo-butyric acid
hmdb0000005 2-ketobutyric acid 2-oxobutyric acid
hmdb0000005 2-ketobutyric acid 2-oxo-n-butyrate
hmdb0000005 2-ketobutyric acid 2-oxo-n-butyric acid
hmdb0000005 2-ketobutyric acid 3-methylpyruvate
hmdb0000005 2-ketobutyric acid 3-methyl pyruvate
hmdb0000005 2-ketobutyric acid 3-methyl pyruvic acid
hmdb0000005 2-ketobutyric acid 3-methylpyruvic acid
hmdb0000005 2-ketobutyric acid a-ketobutyrate
hmdb0000005 2-ketobutyric acid a-ketobutyric acid
hmdb0000005 2-ketobutyric acid a-keto-n-butyrate
hmdb0000005 2-ketobutyric acid a-keto-n-butyric acid
hmdb0000005 2-ketobutyric acid alpha-ketobutric acid
hmdb0000005 2-ketobutyric acid alpha-ketobutyrate
hmdb0000005 2-ketobutyric acid alpha-ketobutyric acid
hmdb0000005 2-ketobutyric acid alpha-ketobutyric acid, sodium salt

[hmdb]# SELECT i.accession, i.name, s.synonym FROM hmdb_identifiers as i RIGHT JOIN hmdb_synonyms AS s ON i.accession = s.accession WHERE s.synonym ILIKE '%ketobut%' LIMIT 25;

 accession               name                                      synonym                         
----------- ------------------------------- ------------------------------------------------------
hmdb0001553 2-oxo-4-methylthiobutanoic acid 4-methylthio-2-ketobutanoate
hmdb0001553 2-oxo-4-methylthiobutanoic acid 4-methylthio-2-ketobutanoic acid
hmdb0001553 2-oxo-4-methylthiobutanoic acid 4-methylthio-2-ketobutyrate
hmdb0001553 2-oxo-4-methylthiobutanoic acid s-methyl-alpha-ketobutyric acid
hmdb0001553 2-oxo-4-methylthiobutanoic acid 4-methylthio-2-ketobutyric acid
hmdb0003407 diacetyl                        2,3-diketobutane
hmdb0003407 diacetyl                        diketobutane
hmdb0033851 4,4-dimethoxy-2-butanone        3-ketobutyraldehyde dimethyl acetal
hmdb0003771 2-methylacetoacetic acid        2-methyl-3-ketobutyric acid
hmdb0031204 4-hydroxy-2-oxobutanoic acid    g-hydroxy-a-ketobutyric acid
hmdb0006454 l-2-amino-3-oxobutanoic acid    2-amino-3-ketobutyric acid
hmdb0006454 l-2-amino-3-oxobutanoic acid    2-amino-3-ketobutyrate
hmdb0000005 2-ketobutyric acid              a-ketobutyrate
hmdb0033698 acetoin acetate                 2-ketobutan-3-yl acetate
hmdb0033851 4,4-dimethoxy-2-butanone        3-ketobutyraldehyde dimethylacetal
hmdb0036396 3-methylbutyl 3-oxobutanoate    3-methylbutyl beta-ketobutyrate
hmdb0036230 1-methyl-2-oxopropyl butyrate   2-ketobutan-3-yl butanoate
hmdb0036396 3-methylbutyl 3-oxobutanoate    isoamyl beta-ketobutyrate
hmdb0036396 3-methylbutyl 3-oxobutanoate    isopentyl beta-ketobutyrate
hmdb0038256 geranyl acetoacetate            geranyl beta-ketobutyrate
hmdb0039324 xi-3-hydroxy-2-oxobutanoic acid 3-hydroxy-alpha-ketobutyrate
hmdb0039324 xi-3-hydroxy-2-oxobutanoic acid beta-hydroxy-alpha-ketobutyric acid
hmdb0041914 ketophenylbutazone              1,2-diphenyl-4-(gamma-ketobutyl)-3,5-pyrazolidinedione
hmdb0041914 ketophenylbutazone              1,2-diphenyl-4-(g-ketobutyl)-3,5-pyrazolidinedione
hmdb0041914 ketophenylbutazone              1,2-diphenyl-4-(γ-ketobutyl)-3,5-pyrazolidinedione

[hmdb]# SELECT i.accession, i.name, s.synonym FROM hmdb_identifiers as i JOIN hmdb_synonyms AS s ON i.accession = s.accession WHERE s.synonym ILIKE '%α-keto%' LIMIT 25;

 accession             name                       synonym             
----------- -------------------------- ------------------------------
hmdb0000005 2-ketobutyric acid         α-ketobutyrate
hmdb0000005 2-ketobutyric acid         α-ketobutyric acid
hmdb0000019 alpha-ketoisovaleric acid  α-ketoisovalerate
hmdb0000019 alpha-ketoisovaleric acid  α-ketoisovaleric acid
hmdb0000019 alpha-ketoisovaleric acid  α-keto-isovalerate
hmdb0000019 alpha-ketoisovaleric acid  α-keto-isovaleric acid
hmdb0000019 alpha-ketoisovaleric acid  α-ketovaline
hmdb0000119 glyoxylic acid             α-ketoacetate
hmdb0000119 glyoxylic acid             α-ketoacetic acid
hmdb0000695 ketoleucine                α-ketoisocaproate
hmdb0000205 phenylpyruvic acid         α-ketohydrocinnamate
hmdb0000205 phenylpyruvic acid         α-ketohydrocinnamic acid
hmdb0000491 3-methyl-2-oxovaleric acid α-keto-β-methyl-n-valerate
hmdb0000491 3-methyl-2-oxovaleric acid α-keto-β-methyl-n-valeric acid
hmdb0000491 3-methyl-2-oxovaleric acid α-keto-β-methylvalerate
hmdb0000491 3-methyl-2-oxovaleric acid α-keto-β-methylvaleric acid
hmdb0000208 oxoglutaric acid           α-ketoglutarate
hmdb0000208 oxoglutaric acid           α-ketoglutaric acid
hmdb0000243 pyruvic acid               α-ketopropionate
hmdb0000243 pyruvic acid               α-ketopropionic acid
hmdb0000695 ketoleucine                α-ketoisocaproic acid
hmdb0031522 mesoxalic acid             α-ketomalonate
hmdb0001587 phenylglyoxylic acid       α-ketophenylacetate
hmdb0001587 phenylglyoxylic acid       α-ketophenylacetic acid
hmdb0004225 2-oxoarginine              α-keto-δ-guanidinovalerate

[hmdb]# 

DATA:

I uploaded the XML data files used in this answer at this Gist

Direct links: