Title: Using Free and Open Source GIS to Automatically Create
Standards-Based Spatial Metadata in Academia – First Investigations

Length: 7 955 words, 26 pages

Stated objective of the article: The authors aim to present preliminary
work done on an approach to metadata automation in an academic context,
making use of QGIS and PostGIS. They describe how the creation of 18 of
the 20 INSPIRE mandatory metadata elements can be automated in an
approach where the data and metadata are tightly coupled, allowing GIS
interoperability.

Title: The title accurately reflects the content of the paper, except
that it is not clear why the title and this work are restricted to an
academic context. Refer also to the comment under ‘Review’ below.

Abstract: Similar to the title, the abstract accurately reflects the
content of the paper, except that it is not clear why this work is
restricted to an academic context. Refer also to the comment under
‘Review’ below.

Keywords: I suggest adding ‘automation’ and/or ‘metadata
automation’ to the list of keywords.

Review: Metadata generation and maintenance remain a challenge for which
solutions need to be sought. This paper presents first investigations
into a novel approach for metadata automation. The authors provide a
logical justification for the research that refers to relevant
literature. The paper is interesting and relevant to the FOSS4G2013
target audience, because open source tools are used. The paper is
equally relevant to the wider geospatial community.

The stated objective of the paper is mostly met. The authors describe
how they automated the creation of 18 of the 20 INSPIRE mandatory
metadata elements. The data and metadata are tightly coupled in that
they are stored in the same database, but it is not clear whether the
workflow is tightly coupled. The authors refer to the ‘tightly coupled’
characteristic in two ways: tightly coupled in terms of storage (are the
data and metadata integrated?), as well as tightly coupled in terms of
workflow (is the metadata updated as part of the spatial editing
workflow?). The approach described in the paper is definitely tightly
coupled in terms of storage, but there is not enough information to
evaluate whether the workflow is also tightly coupled. For example, when
will the keywords be updated?

The claim that the approach described in the paper is interoperable
needs to be better qualified in the abstract, introduction and
conclusion. The definition for interoperability used in the ISO 19100
series of standards is ‘capability to communicate, execute programs, or
transfer data among various functional units in a manner that requires
the user to have little or no knowledge of the unique characteristics of
those units’ (ISO/IEC 2382-1:1993). As correctly reflected in Figure 5,
the metadata in the presented approach is accessible and searchable from
different GIS packages (i.e. syntactic interoperability), but the other
packages do not ‘understand’ it, so the metadata cannot be updated (i.e.
no semantic interoperability).

The authors describe their work as applying to an academic context, but
this academic context is not described; there is no information to
explain why the academic context differs from others. For example, in
section 3.2 they explain that a significant number of metadata elements
may be automated in an academic context, but they do not justify why
this is different from other contexts. I could only find information in
the last paragraph of the paper about the difference of an academic
context, i.e. adding work package information. My suggestion would be to
remove the constraint of an academic context from the title, abstract
and paper; the work that is described could equally well be applied and
used in a non-academic context.

The keyword generation and dataset language detection described in
section 5 are probably the most interesting automation features that the
authors have implemented. These should be reflected in the objectives of
the paper.

The title seems to suggest that there is a specific reason why FOSS4G
was used, but there is no justification in the paper. For example, was
the motivation that PostGIS and QGIS are free to install, or that it
is easy to write plug-ins, which may then be distributed freely with the
software installation? This justification would also be interesting for
the FOSS4G2013 target audience.

Finally, making use of triggers is an interesting tightly coupled
approach, but triggers have their drawbacks and the authors should
acknowledge this. For example, what will happen if a dataset of millions
of points of interest is reprojected and each individual point of
interest record is updated in the database? Will the last revision date
in the metadata be updated for each point of interest? Moreover, moving
the metadata updating functionality to the database hides the
functionality from the user, which could result in unplanned side
effects (such as a cascading effect). How do you plan to address these
downsides of triggers in future?
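
To make the reprojection concern concrete, here is a minimal sketch of the per-row firing behaviour. It is not the authors' implementation: it uses SQLite through Python's sqlite3 module as a stand-in for PostGIS, and the table names (poi, dataset_metadata), the trigger name and the trigger_firings counter column are all hypothetical, introduced only for illustration. A single UPDATE statement standing in for a reprojection touches every point, and the row-level trigger rewrites the dataset's revision date once per point.

```python
import sqlite3


def count_trigger_firings(n_rows: int) -> int:
    """Return how often a row-level metadata trigger fires for one bulk UPDATE.

    All table, column and trigger names are hypothetical (not from the paper).
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE poi (id INTEGER PRIMARY KEY, x REAL, y REAL);
        CREATE TABLE dataset_metadata (
            dataset TEXT PRIMARY KEY,
            last_revision TEXT,
            trigger_firings INTEGER NOT NULL DEFAULT 0
        );
        INSERT INTO dataset_metadata (dataset) VALUES ('poi');

        -- Row-level trigger: runs once for every updated point.
        CREATE TRIGGER poi_touch_metadata AFTER UPDATE ON poi
        FOR EACH ROW
        BEGIN
            UPDATE dataset_metadata
               SET last_revision = datetime('now'),
                   trigger_firings = trigger_firings + 1
             WHERE dataset = 'poi';
        END;
        """
    )
    conn.executemany(
        "INSERT INTO poi (x, y) VALUES (?, ?)",
        [(float(i), float(i)) for i in range(n_rows)],
    )
    # One statement stands in for a reprojection that changes every coordinate.
    conn.execute("UPDATE poi SET x = x + 1000.0, y = y + 1000.0")
    (firings,) = conn.execute(
        "SELECT trigger_firings FROM dataset_metadata WHERE dataset = 'poi'"
    ).fetchone()
    conn.close()
    return firings


if __name__ == "__main__":
    print(count_trigger_firings(1000))
```

SQLite only offers row-level triggers, whereas PostgreSQL also accepts FOR EACH STATEMENT in CREATE TRIGGER, which would update the revision date once per statement. Whether that option suits the approach in the paper is for the authors to assess.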

Below are a few detailed comments on individual sections of the paper:

1. Introduction
There is no overview of the paper in the introduction (to describe how
different sections contribute to the stated objective) and the sections
also do not have an introductory paragraph to explain how their content
contributes to the paper’s objectives. This leaves the reader to read
through individual sections in order to understand their contribution.
Either the overview paragraph or introductory text for each section
needs to be added.

3. Automating Metadata Creation
3.2 I suggest that you left-align the first column of Table 1 for
better presentation.

4. Implementing Metadata Creation in FOSS GIS
Figure 4: Replace ‘Futures’ with ‘Future’.

4.2.1 This subsection lists metadata details that are added. It is not
clear when they are added: when the data is imported, or when an update
is made? The answer seems to be provided just before 4.3.1, which is too
far away.

The text at the beginning of 4.3, before 4.3.1, should be moved into its
own sub-section with an appropriate title.

4.3.1 It is not clear why the metadata language needs to be detected if
the user has the opportunity to change the dataset title, abstract and
lineage. Why don’t you just add a dropdown for the language onto the
dialog displayed in Figure 6? I agree that detecting the language of the
dataset itself is a relevant challenge.

5. Testing Metadata Automation
The footnote on p16 is important enough to be included in the main text.

For each keyword in the third column of Table 2, a figure is displayed
in brackets. It is not clear what this figure is. Since the value for
the fourth column is the same for all datasets, consider removing that
column (the information is provided in the text), which will allow more
space for the third column.

5.0.2 (p5) This subsection number is incorrect. Last paragraph:
‘…boundary extents and polygon for the dataset was updated…’ should be
‘…boundary extents and polygon for the dataset were updated…’

References: There are a number of issues with the referencing, which need
to be addressed:

1. ISO standards cited in the text are not listed in the references.
They have to be added, e.g. ISO 19115:2003, Geographic Information --
Metadata. International Organization for Standardization, Geneva,
Switzerland.

2. References are cited incorrectly in some places, e.g. ‘Beyond these
basics, (Kalantari et al. 2010) have introduced…’ should be ‘Beyond
these basics, Kalantari et al. (2010) have introduced…’

3. There are a number of incomplete references in the list, e.g.
- Ellul et al. (2012): Who is the publisher of the book? Editors of the
  book?
- Batcheller (2008): The volume and issue numbers are missing.
- Deng & Di (2009): The volume number is missing.
- Poore & Wolf (2010): No source is provided for the article.

Language: The article is well-written and easy to follow. One small
issue: it is not clear why some words appear in quotes, e.g. ‘catalog’
(first paragraph, p5) and ‘properties’ (last paragraph, p5). I suggest
removing the quotes.