library(sf)
11 Data Sets
11.1 Madrid Airbnb
This dataset contains a pre-processed set of properties advertised on the Airbnb website within the region of Madrid (Spain), together with their characteristics.
Availability
The dataset is stored in a GeoPackage that can be found, within the structure of this project, under:
path <- "data/assignment_1_madrid/madrid_abb.gpkg"
db <- st_read(path)
Reading layer `madrid_abb' from data source
`/Users/franciscorowe/Dropbox/Francisco/uol/teaching/envs453/202223/san/data/assignment_1_madrid/madrid_abb.gpkg'
using driver `GPKG'
Simple feature collection with 18399 features and 16 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: -3.86391 ymin: 40.33243 xmax: -3.556 ymax: 40.56274
Geodetic CRS: WGS 84
Variables
For each of the 18,399 properties, the following characteristics are available:
price
: [string] Price with currency
price_usd
: [int] Price expressed in USD
log1pm_price_usd
: [float] Log of the price
accommodates
: [integer] Number of people the property accommodates
bathrooms
: [integer] Number of bathrooms
bedrooms
: [integer] Number of bedrooms
beds
: [integer] Number of beds
neighbourhood
: [string] Name of the neighbourhood the property is located in
room_type
: [string] Type of room offered (shared, private, entire home, hotel room)
property_type
: [string] Type of property advertised (apartment, house, hut, etc.)
WiFi
: [binary] Takes 1 if the property has WiFi, 0 otherwise
Coffee
: [binary] Takes 1 if the property has a coffee maker, 0 otherwise
Gym
: [binary] Takes 1 if the property has access to a gym, 0 otherwise
Parking
: [binary] Takes 1 if the property offers parking, 0 otherwise
km_to_retiro
: [float] Euclidean distance from the property to the El Retiro park
geom
: [geometry] Point geometry
Projection
The location of each property is stored as point geometries and expressed in longitude and latitude coordinates:
st_crs(db)
Coordinate Reference System:
User input: WGS 84
wkt:
GEOGCRS["WGS 84",
ENSEMBLE["World Geodetic System 1984 ensemble",
MEMBER["World Geodetic System 1984 (Transit)"],
MEMBER["World Geodetic System 1984 (G730)"],
MEMBER["World Geodetic System 1984 (G873)"],
MEMBER["World Geodetic System 1984 (G1150)"],
MEMBER["World Geodetic System 1984 (G1674)"],
MEMBER["World Geodetic System 1984 (G1762)"],
MEMBER["World Geodetic System 1984 (G2139)"],
ELLIPSOID["WGS 84",6378137,298.257223563,
LENGTHUNIT["metre",1]],
ENSEMBLEACCURACY[2.0]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
CS[ellipsoidal,2],
AXIS["geodetic latitude (Lat)",north,
ORDER[1],
ANGLEUNIT["degree",0.0174532925199433]],
AXIS["geodetic longitude (Lon)",east,
ORDER[2],
ANGLEUNIT["degree",0.0174532925199433]],
USAGE[
SCOPE["Horizontal component of 3D system."],
AREA["World."],
BBOX[-90,-180,90,180]],
ID["EPSG",4326]]
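Because the coordinates are in degrees, planar distance calculations are usually easier after reprojecting to a metric CRS. The sketch below is illustrative only: it uses a toy `pts` object rather than `db`, the two coordinate pairs are approximate locations in central Madrid chosen for the example, and EPSG:25830 (ETRS89 / UTM zone 30N) is an assumed choice of projected CRS, not necessarily the one used to build `km_to_retiro`.

```r
library(sf)

# Toy stand-in for `db`: two approximate, illustrative points in Madrid (WGS 84)
pts <- st_as_sf(
  data.frame(
    id  = c("a", "b"),
    lon = c(-3.7038, -3.6830),
    lat = c(40.4168, 40.4153)
  ),
  coords = c("lon", "lat"),
  crs = 4326
)

# Reproject to ETRS89 / UTM zone 30N (assumed choice), whose units are metres
pts_utm <- st_transform(pts, 25830)

# Planar distance between the two points, in metres
d <- st_distance(pts_utm[1, ], pts_utm[2, ])
print(d)
```

The same `st_transform()` call could be applied to `db` directly once it has been read.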
Source & Pre-processing
The data are sourced from Inside Airbnb. A Jupyter notebook in Python (available at data/assignment_1_madrid/clean_data.ipynb) details the process from the original file available from source to the data in madrid_abb.gpkg.
11.2 England COVID-19
This dataset contains:
daily COVID-19 confirmed cases from 31st January, 2020 to 5th February, 2021 from the GOV.UK dashboard;
resident population characteristics from the 2011 census, available from the Office for National Statistics; and,
2019 Index of Multiple Deprivation (IMD) data from GOV.UK, published by the Ministry of Housing, Communities & Local Government.
The data are at the Upper Tier Local Authority District (UTLAD) level, also known as Counties and Unitary Authorities.
Availability
The dataset is stored in a GeoPackage:
sdf <- st_read("data/assignment_2_covid/covid19_eng.gpkg")
Reading layer `covid19_eng' from data source
`/Users/franciscorowe/Dropbox/Francisco/uol/teaching/envs453/202223/san/data/assignment_2_covid/covid19_eng.gpkg'
using driver `GPKG'
Simple feature collection with 149 features and 507 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 134112.4 ymin: 11429.67 xmax: 655653.8 ymax: 657536
Projected CRS: OSGB36 / British National Grid
Variables
The data set contains 508 variables:
objectid
: [integer] unit identifier
ctyua19cd
: [integer] Upper Tier Local Authority District (or Counties and Unitary Authorities) identifier
ctyua19nm
: [character] Upper Tier Local Authority District (or Counties and Unitary Authorities) name
Region
: [character] Region name
long
: [numeric] longitude
lat
: [numeric] latitude
st_areasha
: [numeric] area in hectares
X2020.01.31 to X2021.02.05
: [numeric] Daily COVID-19 cases from 31st January, 2020 to 5th February, 2021
IMD...Average.score to IMD.2019...Local.concentration
: [numeric] IMD indicators - for details see File 11: upper-tier local authority summaries.
Residents
: [numeric] Total resident population
Households
: [numeric] Total households
Dwellings
: [numeric] Total dwellings
Household_Spaces
: [numeric] Total household spaces
Aged_16plus to Other_industry
: [numeric] 114 variables relating to various population and household attributes of the resident population. A description of all these variables can be found here
geom
: [geometry] Multipolygon geometry
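With several hundred daily-case columns, it is often convenient to select them by name pattern rather than by position. The following is a minimal sketch on a toy data frame, `toy`, which is hypothetical and only mimics the column naming described above; the values are made up for illustration.

```r
# Toy stand-in for `sdf`, using the same column-naming pattern
toy <- data.frame(
  ctyua19nm   = c("A", "B"),
  X2020.01.31 = c(0, 1),
  X2020.02.01 = c(2, 3),
  Residents   = c(1000, 2000)
)

# Daily-case columns: "X" followed by a yyyy.mm.dd date
day_cols <- grep("^X[0-9]{4}\\.[0-9]{2}\\.[0-9]{2}$", names(toy), value = TRUE)

# Recover the dates from the column names
dates <- as.Date(day_cols, format = "X%Y.%m.%d")

# Total cases per area and cumulative cases per 100,000 residents
toy$total_cases <- rowSums(toy[, day_cols])
toy$cases_per_100k <- toy$total_cases / toy$Residents * 1e5
```

On the real `sdf`, the geometry column stays attached when subsetting, so dropping it first with `st_drop_geometry(sdf)` is likely needed before calling `rowSums()`.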
Projection
Details of the coordinate reference system:
st_crs(sdf)
Coordinate Reference System:
User input: OSGB36 / British National Grid
wkt:
PROJCRS["OSGB36 / British National Grid",
BASEGEOGCRS["OSGB36",
DATUM["Ordnance Survey of Great Britain 1936",
ELLIPSOID["Airy 1830",6377563.396,299.3249646,
LENGTHUNIT["metre",1]]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
ID["EPSG",4277]],
CONVERSION["British National Grid",
METHOD["Transverse Mercator",
ID["EPSG",9807]],
PARAMETER["Latitude of natural origin",49,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8801]],
PARAMETER["Longitude of natural origin",-2,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8802]],
PARAMETER["Scale factor at natural origin",0.9996012717,
SCALEUNIT["unity",1],
ID["EPSG",8805]],
PARAMETER["False easting",400000,
LENGTHUNIT["metre",1],
ID["EPSG",8806]],
PARAMETER["False northing",-100000,
LENGTHUNIT["metre",1],
ID["EPSG",8807]]],
CS[Cartesian,2],
AXIS["(E)",east,
ORDER[1],
LENGTHUNIT["metre",1]],
AXIS["(N)",north,
ORDER[2],
LENGTHUNIT["metre",1]],
USAGE[
SCOPE["Engineering survey, topographic mapping."],
AREA["United Kingdom (UK) - offshore to boundary of UKCS within 49°45'N to 61°N and 9°W to 2°E; onshore Great Britain (England, Wales and Scotland). Isle of Man onshore."],
BBOX[49.75,-9,61.01,2.01]],
ID["EPSG",27700]]