Bash script for xml parsing

By | November 1, 2019

Most programming languages already ship with XML processing, often through several APIs, but some of those APIs are awkward and cumbersome. Recently I needed to parse an XML file from a bash script, so I wrote my own parser. It is no great invention, but it is small and it works for me, so I want to share it with you. The sample XML file is:


<?xml version="1.0" encoding="UTF-8"?>
<Customer 1>
   <First name>Jesus</First name>
   <Second name>Christ</Second name>
</Customer 1>
<Customer 2>
   <First name>Mary</First name>
   <Second name>Magdalene</Second name>
</Customer 2>
</xml>

The get_xml_node_value.sh script itself is:


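#!/bin/bash
# Sets $value to the text of node $3 inside section $2 of XML file $1.
get_xml_node_value()
{
  local file=$1 section=$2 node=$3 startline lastline
  value=""
  # Line numbers of the section's opening and closing tags (first match only).
  startline=$(grep -n "<${section}[ >]" "$file" | head -n 1 | cut -d':' -f1)
  lastline=$(grep -n "</${section}>" "$file" | head -n 1 | cut -d':' -f1)
  [ -n "$startline" ] && [ -n "$lastline" ] || return 1
  # Join the section into one line, drop everything before the node's opening tag,
  # strip the node's tags, keep the text before the next "<", trim trailing whitespace.
  value=$(sed -n "${startline},${lastline}p" "$file" | tr '\n' ' ' \
    | sed -e "s/^.*<$node/<$node/" -e "s/<$node>//g" -e "s/<\/$node>//g" \
    | awk -F"<" '{print $1}' | sed -e 's/[[:space:]]*$//')
}
if [ $# -eq 3 ]; then
  if [ -f "$1" ]; then
    get_xml_node_value "$1" "$2" "$3"
    echo "$value"
  else
    echo "File $1 not found"
  fi
else
  echo "3 arguments are required: XML file name, section name, node name"
  echo "Example:"
  echo "./get_xml_node_value.sh abc.xml \"Customer 1\" \"First name\""
fi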
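
To see what each stage of the sed/awk pipeline contributes, here is a minimal sketch that runs the stages one at a time. The one-line input string and the variable names (section, node, step1 to step3) are made up for illustration; the string only approximates what a section looks like once its lines have been joined.

```shell
#!/bin/bash
# Hypothetical input: a section already joined into a single line.
section='<Customer 1> <First name>Jesus</First name> <Second name>Christ</Second name> </Customer 1>'
node='First name'

# Step 1: drop everything before the node's opening tag.
step1=$(printf '%s\n' "$section" | sed -e "s/^.*<$node/<$node/")

# Step 2: remove the node's opening and closing tags.
step2=$(printf '%s\n' "$step1" | sed -e "s/<$node>//g" -e "s/<\/$node>//g")

# Step 3: keep only the text before the next "<", then trim trailing whitespace.
step3=$(printf '%s\n' "$step2" | awk -F"<" '{print $1}' | sed -e 's/[[:space:]]*$//')

# Print the intermediate results, one per line.
printf '%s\n' "$step1" "$step2" "$step3"
```

The last line printed is the extracted value; the two lines before it show how much of the section is still left after each earlier stage.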

Script testing results:


# ./get_xml_node_value.sh abc.xml "Customer 1" "First name"
Jesus
# ./get_xml_node_value.sh abc.xml "Customer 2" "Second name"
Magdalene
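
If several values are needed at once, the same idea can be wrapped in a loop. The sketch below is only an illustration under a few assumptions: node_value is a hypothetical helper (a compact variant of the grep/sed approach that prints the value instead of setting a variable and cuts at the node's closing tag rather than using awk), the data is written to a temporary file created with mktemp, and every requested node is assumed to exist in its section.

```shell
#!/bin/bash
# Recreate the sample data in a temporary file.
xml_file=$(mktemp)
cat > "$xml_file" <<'EOF'
<Customer 1>
   <First name>Jesus</First name>
   <Second name>Christ</Second name>
</Customer 1>
<Customer 2>
   <First name>Mary</First name>
   <Second name>Magdalene</Second name>
</Customer 2>
EOF

# Hypothetical helper: print the text of node $3 inside section $2 of file $1.
node_value() {
  local file=$1 section=$2 node=$3 first last
  first=$(grep -n "<$section>" "$file" | head -n 1 | cut -d: -f1)
  last=$(grep -n "</$section>" "$file" | head -n 1 | cut -d: -f1)
  [ -n "$first" ] && [ -n "$last" ] || return 1
  # Join the section into one line, then cut away everything around the node's text.
  sed -n "${first},${last}p" "$file" | tr '\n' ' ' \
    | sed -e "s/^.*<$node>//" -e "s/<\/$node>.*$//"
  echo
}

# Collect "First name Second name" for each section, one per line.
full_names=""
for section in "Customer 1" "Customer 2"; do
  first_name=$(node_value "$xml_file" "$section" "First name")
  second_name=$(node_value "$xml_file" "$section" "Second name")
  full_names+="$first_name $second_name"$'\n'
done
printf '%s' "$full_names"
rm -f "$xml_file"
```

Printing the value from the helper makes it easy to capture with command substitution, which is why this variant does not rely on a shared variable.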

The node names passed as arguments are case sensitive.
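
The case sensitivity comes from grep and sed matching text exactly as written. A small sketch of the effect, using a made-up input line and hypothetical variable names; the -i flag of grep is shown only to contrast the two behaviours, not as something the script uses:

```shell
#!/bin/bash
# Hypothetical sample line with a capitalized tag name.
line='   <First name>Jesus</First name>'

# Default matching is case sensitive, so a lowercase pattern finds nothing.
# grep exits non-zero when there is no match, hence the "|| true".
exact=$(printf '%s\n' "$line" | grep -c "<first name>") || true

# With -i the same pattern matches regardless of case.
relaxed=$(printf '%s\n' "$line" | grep -ci "<first name>") || true

echo "case sensitive matches: $exact"
echo "case insensitive matches: $relaxed"
```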

One thought on “Bash script for xml parsing”

  1. Sav O

    Thanks a lot for sharing!
    Parsing XML in pure bash is not the easiest route. There are fine tools such as xmllint, but sometimes we are limited in what can be installed on the servers we have to work with.
    Your approach is a good one, and it works even for XML structures more complicated than the one presented.
