An Executable Conceptual Model for Biodiversity in Planning
- Chris Wallace
- Ex University of the West of England, Bristol
- Structured Programming -> Object Orientation -> XML and Functional
Programming
- Early eXist-db adopter
- Joint founder of the XQuery wikibook
- Now a tree hugger, datasmith and developer with the Bristol Tree Forum
Tree Huggers
In 1730, 294 Indian men and 69 women died while protecting their
village trees.
Members of the Bishnoi branch of Hinduism died while trying to
protect the trees in their village from being turned into the raw material for
building a palace. They literally clung to the trees while being slaughtered by the
foresters. But their action led to a royal decree prohibiting the cutting of trees
in any Bishnoi village. Those villages are now virtual wooded oases amidst an
otherwise desert landscape.
Photo of village women of the Chipko movement (chipko means "to
cling" in Hindi) in the early 1970s in the Garhwal Hills of India, protecting the
trees from being cut down.
Contents
- Biodiversity Net Gain in UK Planning law
- Spreadsheet - Blessing or Curse?
- Data models and UML's OCL
- EARX : An Executable EAR model using XQuery
- Dependencies and computation
- The importance of Algorithmic Transparency
Biodiversity Net Gain : BNG
They paved paradise
And put up a parking lot
- Biodiversity Net Gain (BNG) aims to improve biodiversity after development.
- BNG will be implemented in UK Planning Law with a 10% net gain required.
- How will it be measured?
- Will the assessment be open to public scrutiny?
BNG Metric
- Site is divided into 'parcels' of land with the same habitat type and
quality
- The value of each parcel pre- and post-development is converted to a common
currency of Habitat Units (HU)
  Concentrating here on area habitats, but hedgerows and watercourses are
  also included.
- Development is required to provide a net 10% increase in HU via
a mitigation hierarchy: on-site; off-site; purchased credits
  Off-site could be anywhere in Britain at an increased cost in Units, which
  potentially sees urban biodiversity exported to farming land. If you
  can't acquire any habitat at all, you would be required to pay the
  government, eg £84,000 for one unit of grassland.
- Complex rules determine how parcels are assessed and what character of existing
land can be replaced by what character of proposed land over what timescale.
BNG Metric Computation
- An Excel spreadsheet has been developed for public use
by Natural England, a part of the UK department DEFRA
- There is no full definition of the Metric other than its implementation in the
spreadsheet
  Indeed, no distinction seems to be made in the legislation.
- The spreadsheet contains about 40 sheets which together contain roughly 95
tables
- It has been developed by end-users, mainly ecologists.
- "This is not software development, we are just using an existing
program"
Spreadsheets - Blessing or Curse?
- It's the most widespread 'declarative' computational tool
  I've seen various estimates that around a billion people use Excel.
- Most spreadsheets in use contain errors, sometimes very serious
  Each of us has our own particular horror story. My personal vote goes to the
  infamous Reinhart and Rogoff paper "Growth in a Time of Debt". A basic error in
  their spreadsheet, coupled with bad and opaque methodology, led to support for
  austerity programmes across the world. (BBC News)
- No separation of data and code
  Apart from making the spreadsheet hard to comprehend, this makes testing
  and version control very hard.
- No abstraction
  The core Habitat table contains 132 rows, or 133 or 131,
  all conceptually the same.
- Prototyping is much easier than abstraction
  I personally find it much easier to create
  Schemas from an instance of the document than de novo.
- Incomprehensible cell references
  Excel does have a facility to name cells or sequences
  of cells, but it is rarely used.
- Hidden cells and macros
Excel : Derivation
Computing the Habitat Units of a parcel
=IFERROR(IF(P11="","",
IF(AND(P11=$'G-1 All Habitats'.$X$3,T11>0),U11+V11,
IF(AND(P11=$'G-1 All Habitats'.$X$3,S11>0),U11+V11,
IF(AND(P11=$'G-1 All Habitats'.$X$3,AC11>0),"Any Loss Unacceptable ⚠",
IF(AND(P11=$'G-1 All Habitats'.$X$3,W11>0),"Any Loss Unacceptable ⚠",
H11*J11*L11*O11)
)))),"Check Data ▲")
Excel : Table check
'Fairly' conditions are those which need further evidence
=IF(OR(K11:K258="Fairly Good",K11:K258="Fairly Poor"),
"A 'Fairly Category has been used -
check evidence to ensure this is appropriate ",
"")
One of the summary cells in the Habitat_Baseline table
The challenge
Is there an alternative to this spreadsheet which would:
- Be a transparent implementation of the rules of this metric
- Provide traceability to documentation and legislation
- Provide assistance to users
- Be executable
- Support testing
Complexity
There are two ways of constructing a software design: One way is to make it so
simple that there are obviously no deficiencies, and the other way is to make it
so complicated that there are no obvious deficiencies. The first method is far
more difficult.
Tony Hoare, 1980
Apart from Quicksort, CSP and Z, Tony implemented Algol 60 for
Elliott Brothers, which I used on the NZ Met Office's Elliott 503. He was a close
friend of Michael Jackson, my mentor and my employer for a few years.
Legislators and administrators would do well to heed this insight
too.
Data Modeling
- eg ER models by Peter Chen
- Terminology: intensional (conceptual) and extensional (data)
- Data model and its extension as one or more Databases (Project)
- Entity and its extension as a Table in a Database
  This terminology is unconventional, since in Chen's work this would be
  an Entity Type, and the instance an Entity.
- Attribute and its extension as values in a Table
- Relationship between Entities, as shared values or pointers
Object modeling
- eg UML, Syntropy
- Diagrams of Entities and Relationships +
- Inheritance
- Localized behaviour in response to messages
- Object Constraint Language (OCL), an algebraic expression of invariants
and pre- and post-conditions
Figure: part of the main BNG data model (Bristol Zoo).
Boxes are Entities (immutable, user input or summary); [nnn] is the
cardinality of the Table; lines are dependencies.
EARX : Simplifying assumptions
- All data, raw or computed, is represented by Entities
- Attributes are simple and single-valued
- Computation restricted to the pre-condition (validation) and post-conditions
(derivation) of attributes (and Entities)
- Strictly hierarchical dependencies (visibility):
- no cycles in dependencies between Entities
- no dependence between instances of the same entity
EARX : Executable Data Model
- XQuery is used for the derivation of computed values and validation rules, using
eval()
- The model is an XML document eg BNG4.0 model
- Each table is an XML document eg Baseline Habitat
  I've chosen to use first-class entity names and attribute
  names. An alternative would be a generic Row element with a name attribute,
  containing Field elements with a name attribute. This would allow indexes to
  be used, but it would make the expression of the formulas unreadable.
- Online application written in XQuery running on eXist-db and
JavaScript.
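The trade-off between first-class names and a generic Row/Field representation can be seen in a minimal sketch. The element names and values here are illustrative assumptions, not taken from the BNG model; the generic form only follows the Row and Field shape described in the note on table documents.

```xquery
xquery version "3.1";

(: First-class names: the entity and its attributes are element names. :)
let $named :=
    <Habitat_Baseline>
        <Area>2.5</Area>
        <Distinctiveness_Score>4</Distinctiveness_Score>
    </Habitat_Baseline>

(: Generic form: names are pushed into name attributes (assumed shape). :)
let $generic :=
    <Row name="Habitat_Baseline">
        <Field name="Area">2.5</Field>
        <Field name="Distinctiveness_Score">4</Field>
    </Row>

return (
    (: formula against first-class names :)
    $named/Area * $named/Distinctiveness_Score,
    (: the same formula against the generic representation :)
    $generic/Field[@name = 'Area'] * $generic/Field[@name = 'Distinctiveness_Score']
)
```

Both expressions compute the same product; only the first reads like the formula it implements.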
XQuery : Condition Lookup
table('Habitat_Condition')
[Habitat=$self/Habitat]
/Condition
- table('Habitat_Condition') is replaced by a call to get the instances
of the Table.
- $self is the current instance
- Here the result may be multi-valued, in which case the UI creates a
selector
- In this case the table is common to all Tree Survey Projects
and resides in a common Project.
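A self-contained sketch of this lookup follows. It assumes a hypothetical local:table() that simply selects instances by element name from an in-memory project; the real table() call in EARX, the habitat names and the condition values below are all assumptions made for illustration.

```xquery
xquery version "3.1";

(: Hypothetical in-memory project holding the instances of one Table. :)
declare variable $project :=
    <project>
        <Habitat_Condition>
            <Habitat>Modified grassland</Habitat>
            <Condition>Good</Condition>
        </Habitat_Condition>
        <Habitat_Condition>
            <Habitat>Modified grassland</Habitat>
            <Condition>Moderate</Condition>
        </Habitat_Condition>
        <Habitat_Condition>
            <Habitat>Bramble scrub</Habitat>
            <Condition>Poor</Condition>
        </Habitat_Condition>
    </project>;

(: Stand-in for the EARX table() call: return all instances of a Table. :)
declare function local:table($name as xs:string) as element()* {
    $project/*[local-name() = $name]
};

(: $self is the current instance being edited or computed. :)
let $self :=
    <Habitat_Baseline>
        <Habitat>Modified grassland</Habitat>
    </Habitat_Baseline>
return
    local:table('Habitat_Condition')
        [Habitat = $self/Habitat]
        /Condition ! string()
```

With this data the lookup yields two values, Good and Moderate, which is the multi-valued case where a UI would offer a selector.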
XQuery : Calculate Habitat Units
$self/Area
* $self/Distinctiveness_Score
* $self/Habitat_Condition_Score
* $self/Strategic_significance_multiplier
* $self/Spatial_Risk_Multiplier
- Derivation separated from checking pre-conditions
- Missing values are undefined so the calculation is also undefined
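The treatment of missing values falls out of XQuery arithmetic: if either operand is the empty sequence, the result is the empty sequence. The sketch below wraps the derivation in a hypothetical local:habitat-units() function and applies it to two made-up instances, one complete and one with a score missing; the values are assumptions for illustration only.

```xquery
xquery version "3.1";

(: The derivation from the slide, wrapped in a function for testing. :)
declare function local:habitat-units($self as element()) as xs:double? {
    $self/Area
    * $self/Distinctiveness_Score
    * $self/Habitat_Condition_Score
    * $self/Strategic_significance_multiplier
    * $self/Spatial_Risk_Multiplier
};

let $complete :=
    <Habitat_Baseline>
        <Area>2.5</Area>
        <Distinctiveness_Score>4</Distinctiveness_Score>
        <Habitat_Condition_Score>2</Habitat_Condition_Score>
        <Strategic_significance_multiplier>1</Strategic_significance_multiplier>
        <Spatial_Risk_Multiplier>1</Spatial_Risk_Multiplier>
    </Habitat_Baseline>

(: Same parcel, but the condition has not been assessed yet. :)
let $incomplete :=
    <Habitat_Baseline>
        <Area>2.5</Area>
        <Distinctiveness_Score>4</Distinctiveness_Score>
        <Strategic_significance_multiplier>1</Strategic_significance_multiplier>
        <Spatial_Risk_Multiplier>1</Spatial_Risk_Multiplier>
    </Habitat_Baseline>

return (
    local:habitat-units($complete),
    empty(local:habitat-units($incomplete))
)
```

The query returns 20 followed by true: the complete parcel has a value, and the incomplete one is undefined instead of producing an error string, so no IFERROR-style wrapping is needed in the formula.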
XQuery : Fairly condition
<attribute name="Fairly_Check" type="string">
<description>see User Guide 5.3.6</description>
<derivation>
<code>
if (some $condition in table('Habitat_Baseline')/Habitat_Condition
satisfies starts-with($condition,"Fairly"))
then "Warning - ..."
else "OK - ..."
</code>
</derivation>
</attribute>
Where to place this ?
Challenges and Opportunities
- Multi-dimensional tables, with rows providing one dimension and nested columns
the others.
- Restriction in place of validation
- Summary and Pivot tables
  Random fact: Microsoft held a trademark on the term until 2020.
- Percentage calculations?
- Views - stored or cached?
- Invariants: HU = f(Area) ? Area = f⁻¹(HU)
Computation
The core feature of a spreadsheet is the computation algorithm.
- repeatedly recalculate top down, left to right until nothing changes (VisiCalc
1979)
- construct a dependency graph to determine order of cell evaluation, recalc all
(LANPAR 1969 Forward Referencing/Natural Order Calculation)
- recalc only the affected cells on change (Modern spreadsheets?)
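The third strategy, recalculating only what a change affects, can be sketched over a dependency list. This is an illustration under stated assumptions, not the EARX implementation: it reuses the dependency/require element shape from the topological sort code, and the function name local:affected() and the sample attribute dependencies are hypothetical.

```xquery
xquery version "3.1";

(: Return the changed names plus every name that depends on them,
   directly or indirectly. Each round adds the not-yet-seen dependents
   of the names collected so far, and stops when there are none. :)
declare function local:affected(
    $dependencies as element(dependency)*,
    $changed as xs:string*
) as xs:string* {
    let $next :=
        $dependencies[require = $changed][not(@name = $changed)]
            ! string(@name)
    return
        if (empty($next))
        then $changed
        else local:affected($dependencies, ($changed, $next))
};

(: Assumed sample dependencies between attributes. :)
let $dependencies := (
    <dependency name="Habitat_Units">
        <require>Area</require>
        <require>Distinctiveness_Score</require>
    </dependency>,
    <dependency name="Distinctiveness_Score">
        <require>Habitat</require>
    </dependency>,
    <dependency name="Area"/>,
    <dependency name="Habitat"/>
)
return local:affected($dependencies, "Habitat")
```

A change to Habitat marks Habitat, Distinctiveness_Score and Habitat_Units for recalculation and leaves Area alone. The recursion terminates because each round either adds a new name or stops.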
EARX : computing an Entity Instance
Topological sorting
declare function md:topological-sort(
$unordered as element(dependency)*,
$ordered as element(dependency)* )
as element(dependency)* {
if (empty($unordered))
then $ordered
else
let $dependencies :=
$unordered [every $name in require
satisfies $name = $ordered/@name]
return
if ($dependencies)
then md:topological-sort(
$unordered except $dependencies,
($ordered, $dependencies )
)
else ...
};
This algorithm needs a workaround in eXist-db due to a bug
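For experimentation, here is a self-contained variant of the sort with sample data. It differs from the code shown in two ways, both of which are assumptions of this sketch: the remaining items are selected by name instead of with except, and the elided else branch is filled with a raised error on the guess that no progress means a cycle or a missing requirement. The sample dependencies are made up.

```xquery
xquery version "3.1";

declare function local:topological-sort(
    $unordered as element(dependency)*,
    $ordered   as element(dependency)*
) as element(dependency)* {
    if (empty($unordered))
    then $ordered
    else
        (: items whose requirements have all been ordered already :)
        let $ready :=
            $unordered[every $name in require
                       satisfies $name = $ordered/@name]
        return
            if (exists($ready))
            then local:topological-sort(
                     (: select by name, avoiding 'except' :)
                     $unordered[not(@name = $ready/@name)],
                     ($ordered, $ready))
            else
                (: assumption: no progress means a cycle or unknown name :)
                error(xs:QName("local:cycle"),
                      "Cyclic or unresolved dependencies: "
                      || string-join($unordered/@name, ", "))
};

let $dependencies := (
    <dependency name="Habitat_Units">
        <require>Area</require>
        <require>Distinctiveness_Score</require>
    </dependency>,
    <dependency name="Distinctiveness_Score">
        <require>Habitat</require>
    </dependency>,
    <dependency name="Area"/>,
    <dependency name="Habitat"/>
)
(: '!' keeps the sorted order; a path step would re-sort the nodes :)
return local:topological-sort($dependencies, ()) ! string(@name)
```

With this input the order is Area, Habitat, Distinctiveness_Score, Habitat_Units. The simple map operator matters in the last line: a path expression such as /@name would return the attribute nodes in document order and could lose the sorted order.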
When looking for an algorithm for topological sorting, I asked
ChatGPT (Bard) for an algorithm in XQuery. It came back with syntactically correct
but non-functional code, even after several challenges. Giving up on AI, I did a
basic search and the first result was an algorithm I had written in the
XQuery wikibook 15 years ago.
Development Work to do
The aim is to construct an executable, informative specification more
than a production tool.
- Complete the BNG model - co-development with schema and application
- Test with other applications eg Tree Survey
- Developing the UI from simplistic, single tables
- Automatic entity recalculation is slow
- MDD - Explore code generation from a model
- Pretty-print XQuery code
Algorithmic Transparency
- AI has prompted the need for transparency in the use of algorithms in
government decision-making
- Particular concerns around bias when decisions affect individuals - eg benefit
applications
- A UK standard for a register was developed by the UK Government.
- After two years, the register has only 6 entries.
- Planning decisions affect large groups of people indirectly
- The BNG spreadsheet would be counted as 'transparent'
Legislation and Climate change
- Climate change is real
- Our personal and governmental response is conflicted by short-term concerns
- The current UK government is rowing back on climate change. The mandatory use of
BNG has been delayed
- There is a power and knowledge imbalance between developers and citizens
- Models which are complex and opaque favour developers
- Citizens deserve open data and transparent algorithms
- Declarative models such as EARX might help
They took all the trees
Put 'em in a tree museum