Batch processing means running a number of processes sequentially or simultaneously to save time and effort. If you have a suitable dataset on a capable machine, batch processing lets you organize your tasks and achieve a number of goals in one click. In this post I will go through a few batch processing techniques you will find in ArcGIS.
In ArcGIS, batch processing can be done in a number of ways. The most common practice is to use the Python scripting modules. Python is known not only for how easy it is to write, but also for its simplicity, its data-handling power, and its automatic garbage collection. ArcGIS also offers a very intuitive ModelBuilder tool to lay out various geoprocessing operations schematically. Besides these two, each tool in ArcGIS has a ‘batch’ option in its options menu.
The Python way
ArcGIS lets you use its own Python modules and scripts to process a number of tasks. You can use both IDLE and the command-line version of the Python coding environment. ArcGIS comes with a number of site packages, and arcpy is one of them. A module is a Python file that generally includes functions and classes.
ArcPy is supported by a series of modules, including a mapping module (arcpy.mapping), a Spatial Analyst module (arcpy.sa), and a Geostatistical Analyst module. ArcPy itself builds on (and is a successor to) the successful arcgisscripting module. Its goal is to create the cornerstone for a useful and productive way to perform geographic data analysis, data conversion, data management, and map automation with Python.
Besides arcpy, you have Python’s core modules like sys and also many third-party modules and packages.
An in-depth Python tutorial is beyond the scope of this post; try these links to learn the basics of Python.
- Official Python Tutorial
- LearnPythonTheHardWay by Zed A. Shaw
- Google’s Python Class
When you install a fresh copy of ArcGIS, Python is installed in a separate folder on your computer. You will find links to IDLE, the Python command line, and other shortcuts in your Start menu. ArcGIS also lets you access Python directly from its main application window, and you can use Python from the Field Calculator and custom toolboxes. ArcGIS 10.0 typically installs Python 2.6 to
C:\Python26\ArcGIS10.0\ and 10.1 installs Python 2.7 to
C:\Python27\ArcGIS10.1\; you can use these paths to access Python from your favorite text editor or IDE. The best way to set the path for your system is,
Initially you will have the 32-bit version of Python, unless you also install 64-bit background geoprocessing. To allow other Python installations to access arcpy, a file must be copied from the
\Lib\site-packages\ folder within the ArcGIS Python installation and placed in the corresponding folder of the non-ArcGIS Python. If you have not installed 64-bit background geoprocessing, the file is
Desktop10.1.pth; if you have installed it, the file is
Here is a post describing all sorts of tweaks to the Python installation and runtime environment.
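The 32-bit versus 64-bit distinction and the .pth mechanism can both be checked from plain Python. The sketch below is illustrative only: it reports the version and bitness of the running interpreter, then uses temporary folders to show how a .pth file in a site directory adds a path to sys.path. The folder and file names (fake_site_packages, extra_modules, example_paths.pth) are hypothetical and are not the real ArcGIS files.

```python
import os
import shutil
import site
import struct
import sys
import tempfile


def describe_interpreter():
    """Return (version string, bitness) for the running Python."""
    version = "%d.%d" % (sys.version_info[0], sys.version_info[1])
    bits = struct.calcsize("P") * 8  # pointer size in bits: 32 or 64
    return version, bits


def demo_pth_file():
    """Show that a .pth file adds the folders it lists to sys.path."""
    base = tempfile.mkdtemp()
    site_dir = os.path.join(base, "fake_site_packages")  # hypothetical name
    extra_dir = os.path.join(base, "extra_modules")      # hypothetical name
    os.mkdir(site_dir)
    os.mkdir(extra_dir)
    # A .pth file is plain text with one folder per line.
    with open(os.path.join(site_dir, "example_paths.pth"), "w") as pth:
        pth.write(extra_dir + "\n")
    # Process the folder the way Python processes site-packages at startup.
    site.addsitedir(site_dir)
    added = os.path.abspath(extra_dir) in sys.path
    # Clean up: remove the temporary entries and folders again.
    for entry in (site_dir, extra_dir):
        while os.path.abspath(entry) in sys.path:
            sys.path.remove(os.path.abspath(entry))
    shutil.rmtree(base)
    return added


version, bits = describe_interpreter()
print("Python %s, %d-bit" % (version, bits))
print("Path from .pth file was added: %s" % demo_pth_file())
```

Running this with the interpreter under the ArcGIS Python folder and again with your other Python installation is a quick way to confirm which one is 32-bit and which is 64-bit.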
arcpy provides access to geoprocessing tools as well as additional functions, classes, and modules that allow you to create simple or complex workflows. Broadly speaking, arcpy is organized into tools, functions, classes, and modules. The arcpy site package is built closely upon Python 2.6 and requires this version (or later) to be successfully imported.
Note that there are also non-tool functions available within
arcpy. Here are some
Below is an example of a Python script running in the Python window. The script runs the Buffer tool with a distance of 500 meters over a shapefile called “rivers” and produces “rivers_500m_buffer”. In the Python window, each line of code is preceded by the >>> prompt.
>>> import arcpy
>>> result = arcpy.Buffer_analysis("rivers", "rivers_500m_buffer", "500 METERS")
>>> print(result)
C:/[Default location]/rivers_500m_buffer
Return the number of features
>>> result = arcpy.GetCount_management("streets_50m_of_rivers")
>>> print(result.getOutput(0))
Play some more.
>>> # Return a list of default spatial grid indexes for a feature class.
>>> result = arcpy.CalculateDefaultGridIndex_management("streets_50m_of_rivers")
>>> for i in range(0, result.outputCount):
...     print(result.getOutput(i))
...
>>> # Set the cell size of a raster
>>> if arcpy.env.cellSize != 30:
...     arcpy.env.cellSize = 30
...
>>> # Let's do something more with the Buffer tool
>>> import arcpy
>>> arcpy.env.workspace = 'C:/some_location/example.gdb'
>>> layer = 'City_Trails'
>>> distances = ['100 meters', '200 meters', '400 meters']
>>> for dist in distances:
...     output = layer + '_' + dist.replace(' ', '_')
...     arcpy.Buffer_analysis(layer, output, dist)
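The buffer loop can also be wrapped in a small function, so the naming logic is reusable and can be tried out without ArcGIS at hand. This is only a sketch and not part of arcpy: batch_buffer and its buffer_func parameter are hypothetical names, and the lambda used below is a stand-in that just records the calls.

```python
def batch_buffer(layer, distances, buffer_func):
    """Run buffer_func once per distance and return the output names.

    buffer_func is any callable taking (input, output, distance).
    Assumption: in ArcGIS the caller would pass arcpy.Buffer_analysis,
    so this sketch does not import arcpy itself.
    """
    outputs = []
    for dist in distances:
        # '100 meters' becomes 'City_Trails_100_meters'
        output = layer + "_" + dist.replace(" ", "_")
        buffer_func(layer, output, dist)
        outputs.append(output)
    return outputs


# Dry run with a stand-in function that only records the calls.
calls = []
names = batch_buffer(
    "City_Trails",
    ["100 meters", "200 meters", "400 meters"],
    lambda src, out, dist: calls.append((src, out, dist)),
)
print(names)
```

In the Python window, the equivalent call would be batch_buffer(layer, distances, arcpy.Buffer_analysis).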
arcpy.mapping is a module within arcpy, typically used to manipulate the contents of existing map documents (.mxd) and layer files (.lyr); it can also be used to automate map production. The software resource center has a tutorial page dedicated to the arcpy.mapping module.
The Spatial Analyst (arcpy.sa) module gives you access to Spatial Analyst functionality, including tools, operators, functions, and classes. The simplest way to get this access is to import from the sa module. With this import method you can use Spatial Analyst functionality without a namespace prefix, and it also imports the overloaded operators, which allow rasters to be used with operators. You need to check out the Spatial Analyst license before running a tool.
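The license step can be sketched as a small wrapper around CheckExtension, CheckOutExtension, and CheckInExtension, which arcpy provides for this purpose. The wrapper name run_with_spatial_analyst and the FakeArcpy class are hypothetical; the stand-in class exists only so the flow can be followed without ArcGIS installed. In a real script you would pass the imported arcpy module instead.

```python
def run_with_spatial_analyst(arcpy_module, task):
    """Check out the Spatial Analyst license, run task(), check it back in."""
    if arcpy_module.CheckExtension("Spatial") != "Available":
        raise RuntimeError("Spatial Analyst license is not available")
    arcpy_module.CheckOutExtension("Spatial")
    try:
        return task()
    finally:
        # Always return the license, even if the task fails.
        arcpy_module.CheckInExtension("Spatial")


class FakeArcpy(object):
    """Hypothetical stand-in that records license calls (not real arcpy)."""

    def __init__(self):
        self.log = []

    def CheckExtension(self, code):
        self.log.append("check " + code)
        return "Available"

    def CheckOutExtension(self, code):
        self.log.append("out " + code)
        return "CheckedOut"

    def CheckInExtension(self, code):
        self.log.append("in " + code)
        return "CheckedIn"


fake = FakeArcpy()
answer = run_with_spatial_analyst(fake, lambda: "raster work goes here")
print(fake.log)
```

Checking the license back in inside a finally block means a failed tool does not leave the extension checked out for the rest of the session.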
Here is a typical example of automation in one block.
import arcpy
arcpy.env.workspace = "C:/inputs"
for fc in arcpy.ListFeatureClasses():
    outfc = arcpy.Describe(fc).baseName + "_Dissolved"
    arcpy.Dissolve_management(fc, outfc)
In the above code, the first line imports arcpy and the second line sets the input workspace. The third line loops through all the feature classes in the workspace. The fourth line builds an output name with a trailing ‘_Dissolved’ for each feature class, and the fifth line creates that output.
import arcpy.mapping as mapping
mxd = mapping.MapDocument("c:/[document location]/MapDocument.mxd")
# List every layer or table whose data source cannot be found
lstBrokenDS = mapping.ListBrokenDataSources(mxd)
for layer in lstBrokenDS:
    print(layer.name)
The block above lists all the broken data sources in an mxd. To find and replace the missing sources, we can use the following code:
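import arcpy.mapping as mapping
mxd = mapping.MapDocument("c:/[document location]/MapDocument.mxd")
# Both arguments are workspace paths (folder or geodatabase), not .mxd paths
mxd.findAndReplaceWorkspacePaths("C:/[document location]", "C:/[New document location]")
# Save the change to the map document
mxd.save()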
Besides findAndReplaceWorkspacePaths(), you can use replaceWorkspaces() to replace a workspace, and replaceDataSource() to replace the sources of individual layers and tables.
import arcpy.mapping as mapping
mxd = mapping.MapDocument("c:/[Document_location]/MapDocument.mxd")
# Get the reference data frame (first match)
df = mapping.ListDataFrames(mxd, "dataframe")[0]
# Find the layer (first match)
lyr = mapping.ListLayers(mxd, "theLayer", df)[0]
# Replace the data source with a shapefile
lyr.replaceDataSource("c:/[Document_location]", "SHAPEFILE_WORKSPACE", "theLayer")
# Save a copy of the new document
mxd.saveACopy("c:/[Document_location]/newMapDocument.mxd")
Here we have replaced only the first layer (index 0) in the returned list. To find all the broken data sources in every map document under a folder:
import os
import arcpy.mapping as mapping

path = "C:/[Document_location]"

# Store the names of broken data sources
f = open('BrokenDataList.txt', 'w')

# Use os.walk() to walk through the folder tree
for root, dirs, files in os.walk(path):
    for filename in files:
        # Split the file name into base name and extension
        basename, extension = os.path.splitext(filename)
        # For each map document, write its name followed by
        # the names of its broken data sources
        if extension == ".mxd":
            # Join with root (not path) so files in subfolders resolve
            fullPath = os.path.join(root, filename)
            mxd = mapping.MapDocument(fullPath)
            f.write("MXD: " + filename + "\n")
            brknList = mapping.ListBrokenDataSources(mxd)
            for brknItem in brknList:
                f.write("\t" + brknItem.name + "\n")
f.close()
You can fix them by hand, or use replaceDataSource() to do it.
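Before fixing anything, it can help to read the report back into a structure you can loop over. The sketch below assumes the layout written by the script above (one "MXD: " line per map document, followed by tab-indented layer names). The function name parse_broken_report and the sample names are made up for illustration.

```python
def parse_broken_report(lines):
    """Group broken layer names under the map document that owns them."""
    report = {}
    current = None
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith("MXD: "):
            # Start a new entry for this map document
            current = line[len("MXD: "):]
            report[current] = []
        elif line.startswith("\t") and current is not None:
            # Tab-indented lines are broken layers of the current document
            report[current].append(line.strip())
    return report


# Made-up sample in the same layout as BrokenDataList.txt.
sample = [
    "MXD: roads.mxd\n",
    "\tHighways\n",
    "\tBridges\n",
    "MXD: rivers.mxd\n",
]
broken = parse_broken_report(sample)
for mxd_name in sorted(broken):
    print("%s: %d broken" % (mxd_name, len(broken[mxd_name])))
```

With a real report you would pass the open file object instead of the sample list, since iterating over a file yields its lines.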