It’s been a while since I last wrote. This is not a very interesting topic for a post. I regularly receive coordinates and names of locations from field staff in different parts of the country. I plot these locations on maps and create some reports. Pretty boring stuff! Most of the time these locations are neatly stuffed inside Microsoft Office files, commonly DOCX or XLS. But recently they sent me 10 similar DOCX files with 128 locations listed, and I am going to create a KML file for each location using Python.

The DOCX files use several tables to organize different items. Unfortunately, the cells of these tables hold the coordinates of the locations as free text, in no strict format, alongside the name of each location and a short description. To make the task trickier, the coordinates don’t follow a strict DMS (degrees, minutes, seconds) notation; sometimes they are in DD (decimal degrees) instead. The N and E are sometimes put in front of the coordinates and sometimes at the end. So plain Python string operations won’t do the job.
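To give an idea of what a tolerant pattern could look like, here is a minimal sketch. The exact notations in the files are not shown here, so the sample formats, the `COORD` pattern and the helper names `to_decimal` and `find_lat_lon` are all assumptions for illustration. It accepts DMS or DD values, straight or curly quote marks, and a hemisphere letter either before or after the number.

```python
import re

# Assumed notations: 23°45'30"N, N 23°45'30", 23.7585 N, N23.7585 (and S/E/W).
COORD = re.compile(
    r"""
    (?:\b(?P<pre>[NSEW])\s*)?                              # hemisphere letter in front (optional)
    (?<![\d.])(?P<deg>\d{1,3}(?:\.\d+)?)\s*[°º]?\s*        # degrees, whole or decimal
    (?:(?P<min>\d{1,2}(?:\.\d+)?)\s*['’′]\s*)?             # minutes (optional)
    (?:(?P<sec>\d{1,2}(?:\.\d+)?)\s*(?:"|”|″|''|’’)\s*)?   # seconds (optional)
    (?(pre)|(?P<post>[NSEW])\b)                            # trailing letter only if none in front
    """,
    re.VERBOSE,
)


def to_decimal(match):
    """Convert one COORD match to (signed decimal degrees, hemisphere letter)."""
    value = (
        float(match["deg"])
        + float(match["min"] or 0) / 60
        + float(match["sec"] or 0) / 3600
    )
    hemi = match["pre"] or match["post"]
    # South and west are negative in decimal degrees.
    return (-value if hemi in "SW" else value), hemi


def find_lat_lon(text):
    """Return (lat, lon) from the first N/S and E/W values in text, or None."""
    lat = lon = None
    for match in COORD.finditer(text):
        value, hemi = to_decimal(match)
        if hemi in "NS" and lat is None:
            lat = value
        elif hemi in "EW" and lon is None:
            lon = value
    if lat is None or lon is None:
        return None
    return lat, lon
```

Calling `find_lat_lon` on a line of text returns a `(lat, lon)` tuple in decimal degrees, or `None` when the line has no complete pair. Requiring a hemisphere letter on every value keeps ordinary numbers in the text (reference ids, counts) from being mistaken for coordinates.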

I need to open each DOCX (using the docx library) and search each line of text inside each row of each table with a regex pattern. If a suitable XY pattern is found, it is used to create a KML file. Each KML file is named after a reference id (see the pic), and the description is added as well.
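The table walk described above can be sketched roughly like this. It assumes the python-docx package (imported as `docx`) and relies on its `Document.tables`, `table.rows`, `row.cells` and `cell.text` attributes; the function names `iter_table_lines`, `matching_lines` and `lines_from_docx` are made up for this illustration, and the pattern is passed in so any regex can be plugged in.

```python
import re


def iter_table_lines(document):
    """Yield every non-empty line of text from every cell of every table."""
    for table in document.tables:
        for row in table.rows:
            for cell in row.cells:
                # cell.text joins the cell's paragraphs with newlines.
                for line in cell.text.splitlines():
                    line = line.strip()
                    if line:
                        yield line


def matching_lines(document, pattern):
    """Yield only the lines in which the compiled regex pattern is found."""
    for line in iter_table_lines(document):
        if pattern.search(line):
            yield line


def lines_from_docx(path):
    """Open a DOCX file and return all table lines as a list."""
    # Imported here so the helpers above work without python-docx installed.
    from docx import Document

    return list(iter_table_lines(Document(path)))
```

Something like `matching_lines(Document("report.docx"), re.compile(r"\d+\s*°"))` would then give back only the lines that contain a degree value; the file name and the pattern here are placeholders.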

This is where it starts. I’ll put the KML files from each DOCX file in a folder of their own so that they stay organized.
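As a rough sketch of that layout, the snippet below creates one folder per DOCX file (named after the file) and writes a single-placemark KML into it. It builds the KML by hand with the standard library only, which is an assumption made to keep the example self-contained; the function name `write_kml`, the `out_root` parameter and the file-name clean-up rule are also hypothetical.

```python
import re
from pathlib import Path
from xml.sax.saxutils import escape

KML_TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Placemark>
    <name>{name}</name>
    <description>{description}</description>
    <Point>
      <coordinates>{lon},{lat}</coordinates>
    </Point>
  </Placemark>
</kml>
"""


def write_kml(docx_path, ref_id, description, lat, lon, out_root="kml"):
    """Write <out_root>/<docx name>/<ref_id>.kml and return its path."""
    # One folder per DOCX file, named after the file without its extension.
    folder = Path(out_root) / Path(docx_path).stem
    folder.mkdir(parents=True, exist_ok=True)

    # Replace characters that are awkward in file names (assumed rule).
    safe_name = re.sub(r"[^\w.-]+", "_", ref_id)
    kml_path = folder / f"{safe_name}.kml"

    # KML expects longitude first, then latitude.
    kml_path.write_text(
        KML_TEMPLATE.format(
            name=escape(ref_id),
            description=escape(description),
            lon=lon,
            lat=lat,
        ),
        encoding="utf-8",
    )
    return kml_path
```

The text is escaped before it goes into the template because names and descriptions typed by hand can easily contain characters such as `&` that would otherwise break the XML.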

Now it’s time for the actual code.

I know I could have used a different library for creating the KMLs, but I liked it this way. You can even use this to convert the coordinates and create something else.