<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>R | Daniel Antal</title><link>https://danielantal.eu/tag/r/</link><atom:link href="https://danielantal.eu/tag/r/index.xml" rel="self" type="application/rss+xml"/><description>R</description><generator>Wowchemy (https://wowchemy.com)</generator><language>en-us</language><lastBuildDate>Tue, 22 Nov 2022 09:09:00 +0100</lastBuildDate><image><url>https://danielantal.eu/media/icon_hub9491570ac57158c0eeecc95c95b13e5_20247_512x512_fill_lanczos_center_3.png</url><title>R</title><link>https://danielantal.eu/tag/r/</link></image><item><title>Create Datasets that are Easy to Combine and Reuse</title><link>https://danielantal.eu/post/2022-12-02-dataset-on-cran/</link><pubDate>Tue, 22 Nov 2022 09:09:00 +0100</pubDate><guid>https://danielantal.eu/post/2022-12-02-dataset-on-cran/</guid><description>&lt;p>&lt;strong>The latest Reprex R package, &lt;code>dataset&lt;/code>, was released today on the Comprehensive R Archive Network. It is a very early, conceptual package that will help make scientific achievements more open, governmental data easier to find, and stored information easier to combine.&lt;/strong>&lt;/p>
&lt;p>Data interoperability is almost a buzzword, yet we see very few comprehensive, good solutions to apply it. Try to find information on open government portals or on big open science repositories—apart from a few good examples, most datasets are as disorganized as any PC’s hard disk that is collecting dust in a shed.&lt;/p>
&lt;p>The &lt;code>dataset&lt;/code> package aims to bring together the best practices of data semantics, data organization, and the use of standard metadata, so that whatever you store in a data table is immediately available for data analysis, activation, or combination in any new database.&lt;/p>
&lt;p>Ambitious? It is, and &lt;code>dataset 0.1.9&lt;/code> is a very experimental product. While our other packages are aimed at intermediate users with a clear use case in mind, dataset at this point is aimed at package developers. Casual or even heavy R users are unlikely to download it as a standalone product. Instead, &lt;code>dataset&lt;/code> aims to be a stable developer basis for our existing products, rOpenGov packages, and many new uses.&lt;/p>
&lt;td style="text-align: center;">
&lt;figure id="figure-download-datasethttpsdatasetdataobservatoryeu">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="Download dataset" srcset="
/media/img/screenshots/dataset_0_1_9_hu0a73b7b10e7b08d2ea77dda52eaaa2b5_175803_7af70b7a68aa584fa4a40f2efedc9764.webp 400w,
/media/img/screenshots/dataset_0_1_9_hu0a73b7b10e7b08d2ea77dda52eaaa2b5_175803_995895f41cee25e4625b2ce9da9c1c88.webp 760w,
/media/img/screenshots/dataset_0_1_9_hu0a73b7b10e7b08d2ea77dda52eaaa2b5_175803_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://danielantal.eu/media/img/screenshots/dataset_0_1_9_hu0a73b7b10e7b08d2ea77dda52eaaa2b5_175803_7af70b7a68aa584fa4a40f2efedc9764.webp"
width="760"
height="428"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption>
Download &lt;a href="https://dataset.dataobservatory.eu/" target="_blank" rel="noopener">dataset&lt;/a>
&lt;/figcaption>&lt;/figure>&lt;/td>
&lt;p>The metadata aim of &lt;code>dataset&lt;/code> is to add standardized metadata to R data.frames, tibbles, data.tables and other similarly structured, tabular objects. The organization and semantic objectives are to bring the tidy data concept closer to the datacube model, which is the basis of all statistical data exchanges, and to W3C standards, which foster machine-to-machine data communication on traditional web APIs and the semantic web.&lt;/p>
&lt;ol>
&lt;li>Makes data importing easier and less error-prone;&lt;/li>
&lt;li>Leaves plenty of room for documentation automation, resulting in far better reusability and reproducibility;&lt;/li>
&lt;li>The publication of results from R following the &lt;a href="https://www.go-fair.org/fair-principles/" target="_blank" rel="noopener">FAIR&lt;/a> principles is far easier, making the work of the R user more findable, more accessible, more interoperable and more reusable by other users;&lt;/li>
&lt;li>Makes the placement into relational databases, semantic web applications, archives, repositories possible without time-consuming and costly data wrangling (See &lt;a href="https://dataset.dataobservatory.eu/articles/RDF.html" target="_blank" rel="noopener">From dataset To RDF&lt;/a>).&lt;/li>
&lt;/ol>
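&lt;p>The idea of carrying metadata together with a tabular object can be illustrated with base R alone. The sketch below is not the &lt;code>dataset&lt;/code> API: the helper &lt;code>add_metadata()&lt;/code>, the metadata fields and the toy values are all assumptions made for illustration. It simply stores descriptive metadata as attributes of a data.frame and reads them back.&lt;/p>
&lt;pre>&lt;code class="language-r"># Hypothetical helper (not part of the dataset package): store descriptive
# metadata as attributes of a data.frame, using base R only.
add_metadata &amp;lt;- function(df, title, creator, publisher) {
  stopifnot(is.data.frame(df))
  attr(df, "title") &amp;lt;- title
  attr(df, "creator") &amp;lt;- creator
  attr(df, "publisher") &amp;lt;- publisher
  df
}

# Toy table with made-up values, for illustration only.
toy &amp;lt;- data.frame(
  geo = c("NL", "HU"),
  year = c(2021L, 2021L),
  value = c(1.5, 2.5)
)

toy_described &amp;lt;- add_metadata(
  toy,
  title = "Toy indicator",
  creator = "Jane Doe",
  publisher = "Made-up example"
)

# The metadata travels with the object and can be read back by name.
attr(toy_described, "title")
attr(toy_described, "publisher")
&lt;/code>&lt;/pre>
&lt;p>Plain attributes like these may not survive every data transformation, which is one reason why a dedicated, standardized structure can be useful.&lt;/p>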
&lt;p>The first official release offers few immediate benefits. However, if you are an R package developer, we can bring you a few steps nearer to releasing your data products in a way that conforms to the &lt;a href="https://www.go-fair.org/fair-principles/" target="_blank" rel="noopener">FAIR metadata&lt;/a> principles. We can also help you streamline your data wrangling, make integration with relational databases easier, and take a step towards the semantic web.&lt;/p></description></item><item><title>Learn R with Reprex</title><link>https://danielantal.eu/slides/learn-with-reprex/</link><pubDate>Fri, 07 Oct 2022 12:35:00 +0200</pubDate><guid>https://danielantal.eu/slides/learn-with-reprex/</guid><description>&lt;h1 id="big-data-creates-inequalities">Big Data Creates Inequalities&lt;/h1>
&lt;p>Only the largest corporations, best-endowed universities, and rich governments can afford data collection and processing capacities that are large enough to harness the advantages of AI.&lt;/p>
&lt;hr>
&lt;h2 id="slide-navigation">Slide navigation&lt;/h2>
&lt;p>Fullscreen: &lt;code>F&lt;/code>&lt;/p>
&lt;ul>
&lt;li>Next: &lt;code>&amp;gt;&lt;/code> or &lt;code>Space&lt;/code> | Previous: &lt;code>&amp;lt;&lt;/code>&lt;/li>
&lt;li>Start: &lt;code>Home&lt;/code> | Finish: &lt;code>End&lt;/code>&lt;/li>
&lt;li>Overview: &lt;code>Esc&lt;/code> | Speaker notes: &lt;code>S&lt;/code>&lt;/li>
&lt;li>Zoom: &lt;code>Alt + Click 🖱️&lt;/code>&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="big-data-that-works-for-all">Big data that works for all&lt;/h2>
&lt;ul>
&lt;li>
&lt;p style="font-size:75%">No matter how big the problem or how small your team, Reprex fills your reports, dashboards, newsletters, and books with data and its visualizations.&lt;/p>
&lt;/li>
&lt;li>
&lt;p style="font-size:75%">Learn R with us: you can reduce the inequalities by joining the open source movement, learning to run open source software, asking for help, improving the tutorials and the documentation, and eventually learning to make the computer work for you.&lt;/p>
&lt;/li>
&lt;li>
&lt;p style="font-size:75%">Contributor Covenant: Participating in open source is often a highly collaborative experience. We’re encouraged to create in public view, and we’re incentivized to welcome contributions of all kinds from people around the world. This makes the practice of open source as much social as it is technical.&lt;/p>
&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="get-inspired">Get Inspired&lt;/h2>
&lt;ul>
&lt;li>&lt;a href="https://curators.dataobservatory.eu/inspiration.html" target="_blank" rel="noopener">Find more interesting and better data&lt;/a>: you don&amp;rsquo;t have to be a data scientist or write code to contribute to our projects.&lt;/li>
&lt;li>&lt;a href="https://data-feminism.mitpress.mit.edu/" target="_blank" rel="noopener">Data feminism&lt;/a>: Catherine D&amp;rsquo;Ignazio and Lauren Klein present a new way of thinking about data science and data ethics—one that is informed by intersectional feminist thought. Highly inspirational, free, open-source book.&lt;/li>
&lt;li>&lt;a href="https://rladies.org/" target="_blank" rel="noopener">RLadies&lt;/a> is a world-wide organization to promote gender diversity in the R community.&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="contributor-covenant">Contributor Covenant&lt;/h2>
&lt;ul>
&lt;li>
&lt;p style="font-size:75%">We as members, contributors, and leaders pledge to make participation in our community a harassment-free experience for everyone, regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, caste, color, religion, or sexual identity and orientation.&lt;/p>
&lt;/li>
&lt;li>
&lt;p style="font-size:75%">We pledge to act and interact in ways that contribute to an open, welcoming, diverse, inclusive, and healthy community.&lt;/p>
&lt;/li>
&lt;/ul>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="retroharmonize_example_1.webp"
>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="retroharmonize_example_2.webp"
>
&lt;hr>
&lt;h2 id="run-code-from-tutorials">Run code from tutorials&lt;/h2>
&lt;p>&lt;a href="https://retroharmonize.dataobservatory.eu/" target="_blank" rel="noopener">retroharmonize.dataobservatory.eu&lt;/a>&lt;br>
&lt;a href="https://retroharmonize.dataobservatory.eu/articles/retroharmonize.html" target="_blank" rel="noopener">🖱 Get started&lt;/a>&lt;br>
&lt;a href="https://retroharmonize.dataobservatory.eu/articles/index.html" target="_blank" rel="noopener">🖱️ Articles&lt;/a>&lt;/p>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="retroharmonize_readme.webp"
>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="github_issues_spotifyR.webp"
>
&lt;h2 id="find-help-ask-for-help-reprex">Find help, ask for help: reprex&lt;/h2>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="retroharmonize_tutorials.webp"
>
&lt;h2 id="documentation-for-better-tutorials">Documentation for better tutorials&lt;/h2>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="retroharmonize_r_testthat.webp"
>
&lt;h2 id="debugging-and-testing-code">Debugging and testing code&lt;/h2>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="retroharmonize_r_documentation.webp"
>
&lt;h2 id="contribute-to-documentation">Contribute to documentation&lt;/h2>
&lt;hr>
&lt;h2 id="r-is-a-functional-language">R is a functional language&lt;/h2>
&lt;ul>
&lt;li>R is both a statistical environment and a programming language&lt;/li>
&lt;li>R, the open source and further developed version of the S language, is mainly functional&lt;/li>
&lt;li>If you have done a task twice, the third time you had better write a function so that you can keep reusing it.&lt;/li>
&lt;li>Most of your effort will be to find a well-written function for your work&lt;/li>
&lt;li>If you cannot find a function, you will modify somebody else&amp;rsquo;s function, or eventually write your own&lt;/li>
&lt;/ul>
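&lt;p>The advice to write a function for a repeated task can be sketched as follows. The function name &lt;code>rescale01()&lt;/code> and the input values are hypothetical and chosen only for illustration: a rescaling step that would otherwise be copied and pasted is written once and then reused.&lt;/p>
&lt;pre>&lt;code class="language-r"># Hypothetical example: rescale a numeric vector to the 0-1 range.
# Written once as a function instead of repeating the arithmetic by hand.
rescale01 &amp;lt;- function(x) {
  rng &amp;lt;- range(x, na.rm = TRUE)  # smallest and largest non-missing value
  (x - rng[1]) / (rng[2] - rng[1])
}

# Reuse the same function on different inputs.
rescale01(c(2, 4, 6))
rescale01(c(10, 20, NA, 40))
&lt;/code>&lt;/pre>
&lt;p>Missing values stay missing in the result, because only the range calculation ignores them.&lt;/p>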
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="retroharmonize_r_code.webp"
>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="rmd_example.webp"
>
&lt;h2 id="r--yaml--markdown--web-ready">R + YAML + markdown = web ready&lt;/h2>
&lt;hr>
&lt;ul>
&lt;li>&lt;a href="https://learnxinyminutes.com/docs/yaml/" target="_blank" rel="noopener">Learn YAML in Y minutes&lt;/a>: tell the computer what you want to do with a document&lt;/li>
&lt;li>&lt;a href="https://rmarkdown.rstudio.com/authoring_basics.html" target="_blank" rel="noopener">R Markdown basics&lt;/a>: it is just plain markdown that lets you insert small R code &amp;lsquo;chunks&amp;rsquo;.&lt;/li>
&lt;li>&lt;a href="https://github.com/mundimark/awesome-markdown-editors" target="_blank" rel="noopener">Awesome markdown editors and pre-writers&lt;/a>: find a convenient tool&lt;/li>
&lt;li>&lt;a href="https://workspace.google.com/marketplace/app/docs_to_markdown/700168918607" target="_blank" rel="noopener">Google Docs to markdown&lt;/a>: practice by translating your Google Docs text to markdown. It is &lt;em>very&lt;/em> easy.&lt;/li>
&lt;/ul>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="retroharmonize_website.webp"
>
&lt;h2 id="package-and-release-a-team-effort">Package and release: a team effort&lt;/h2>
&lt;hr>
&lt;h2 id="our-open-source-development-projects">Our open source development projects&lt;/h2>
&lt;ul>
&lt;li>🔢 &lt;a href="https://dataset.dataobservatory.eu/" target="_blank" rel="noopener">dataset&lt;/a>: Synchronize datasets with global knowledge hubs&lt;/li>
&lt;li>#️⃣ &lt;a href="https://statcodelists.dataobservatory.eu/" target="_blank" rel="noopener">statcodelists&lt;/a>: Make your data codes understood globally&lt;/li>
&lt;li>♻️ &lt;a href="https://iotables.dataobservatory.eu/" target="_blank" rel="noopener">iotables&lt;/a>: Create economic or environmental impact assessments in any EU country&lt;/li>
&lt;li>🌍 &lt;a href="https://regions.dataobservatory.eu/" target="_blank" rel="noopener">regions&lt;/a>: Create more granular statistics from raw survey data in any EU country&lt;/li>
&lt;li>✅ &lt;a href="https://retroharmonize.dataobservatory.eu/" target="_blank" rel="noopener">retroharmonize&lt;/a>: Harmonize question banks, recycle answers from past surveys&lt;/li>
&lt;li>⏭️ &lt;a href="https://reprex.nl/#releases" target="_blank" rel="noopener">All on one page&lt;/a>&lt;/li>
&lt;/ul>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="create_with_reprex.webp"
>
&lt;h2 id="create-with-us">Create with us&lt;/h2>
&lt;hr>
&lt;h1 id="questions">Questions?&lt;/h1>
&lt;p>&lt;a href="https://reprex.nl/#contact" target="_blank" rel="noopener">Email&lt;/a> | &lt;a href="https://keybase.io/team/reprexcommunity" target="_blank" rel="noopener">Keybase&lt;/a>&lt;/p>
&lt;p>LinkedIn: &lt;a href="https://www.linkedin.com/in/antaldaniel/" target="_blank" rel="noopener">Daniel Antal&lt;/a> - &lt;a href="https://www.linkedin.com/company/68855596" target="_blank" rel="noopener">Reprex&lt;/a> | &lt;a href="https://reprex.nl/" target="_blank" rel="noopener">Home&lt;/a>&lt;/p></description></item><item><title>Learn R with Reprex</title><link>https://danielantal.eu/slides/learnr-with-reprex/</link><pubDate>Fri, 07 Oct 2022 12:35:00 +0200</pubDate><guid>https://danielantal.eu/slides/learnr-with-reprex/</guid><description>&lt;h1 id="big-data-creates-inequalities">Big Data Creates Inequalities&lt;/h1>
&lt;p>Only the largest corporations, best-endowed universities, and rich governments can afford data collection and processing capacities that are large enough to harness the advantages of AI.&lt;/p>
&lt;hr>
&lt;h2 id="slide-navigation">Slide navigation&lt;/h2>
&lt;p>Fullscreen: &lt;code>F&lt;/code>&lt;/p>
&lt;ul>
&lt;li>Next: &lt;code>&amp;gt;&lt;/code> or &lt;code>Space&lt;/code> | Previous: &lt;code>&amp;lt;&lt;/code>&lt;/li>
&lt;li>Start: &lt;code>Home&lt;/code> | Finish: &lt;code>End&lt;/code>&lt;/li>
&lt;li>Overview: &lt;code>Esc&lt;/code> | Speaker notes: &lt;code>S&lt;/code>&lt;/li>
&lt;li>Zoom: &lt;code>Alt + Click 🖱️&lt;/code>&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="big-data-that-works-for-all">Big data that works for all&lt;/h2>
&lt;ul>
&lt;li>
&lt;p style="font-size:75%">No matter how big the problem or how small your team, Reprex fills your reports, dashboards, newsletters, and books with data and its visualizations.&lt;/p>
&lt;/li>
&lt;li>
&lt;p style="font-size:75%">Learn R with us: you can reduce the inequalities by joining the open source movement, learning to run open source software, asking for help, improving the tutorials and the documentation, and eventually learning to make the computer work for you.&lt;/p>
&lt;/li>
&lt;li>
&lt;p style="font-size:75%">Contributor Covenant: Participating in open source is often a highly collaborative experience. We’re encouraged to create in public view, and we’re incentivized to welcome contributions of all kinds from people around the world. This makes the practice of open source as much social as it is technical.&lt;/p>
&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="data-feminism">Data Feminism&lt;/h2>
&lt;hr>
&lt;h2 id="get-inspired">Get Inspired&lt;/h2>
&lt;ul>
&lt;li>&lt;a href="https://curators.dataobservatory.eu/inspiration.html" target="_blank" rel="noopener">Find more interesting and better data&lt;/a>: you don&amp;rsquo;t have to be a data scientist or write code to contribute to our projects.&lt;/li>
&lt;li>&lt;a href="https://data-feminism.mitpress.mit.edu/" target="_blank" rel="noopener">Data feminism&lt;/a>: Catherine D&amp;rsquo;Ignazio and Lauren Klein present a new way of thinking about data science and data ethics—one that is informed by intersectional feminist thought. Highly inspirational, free, open-source book.&lt;/li>
&lt;li>&lt;a href="https://rladies.org/" target="_blank" rel="noopener">RLadies&lt;/a> is a world-wide organization to promote gender diversity in the R community.&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="contributor-covenant">Contributor Covenant&lt;/h2>
&lt;ul>
&lt;li>
&lt;p style="font-size:75%">We as members, contributors, and leaders pledge to make participation in our community a harassment-free experience for everyone, regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, caste, color, religion, or sexual identity and orientation.&lt;/p>
&lt;/li>
&lt;li>
&lt;p style="font-size:75%">We pledge to act and interact in ways that contribute to an open, welcoming, diverse, inclusive, and healthy community.&lt;/p>
&lt;/li>
&lt;/ul>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="retroharmonize_example_1.webp"
>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="retroharmonize_example_2.webp"
>
&lt;hr>
&lt;h2 id="run-code-from-tutorials">Run code from tutorials&lt;/h2>
&lt;p>&lt;a href="https://retroharmonize.dataobservatory.eu/" target="_blank" rel="noopener">retroharmonize.dataobservatory.eu&lt;/a>&lt;br>
&lt;a href="https://retroharmonize.dataobservatory.eu/articles/retroharmonize.html" target="_blank" rel="noopener">🖱 Get started&lt;/a>&lt;br>
&lt;a href="https://retroharmonize.dataobservatory.eu/articles/index.html" target="_blank" rel="noopener">🖱️ Articles&lt;/a>&lt;/p>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="retroharmonize_readme.webp"
>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="github_issues_spotifyR.webp"
>
&lt;h2 id="find-help-ask-for-help-reprex">Find help, ask for help: reprex&lt;/h2>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="retroharmonize_tutorials.webp"
>
&lt;h2 id="documentation-for-better-tutorials">Documentation for better tutorials&lt;/h2>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="retroharmonize_r_testthat.webp"
>
&lt;h2 id="debugging-and-testing-code">Debugging and testing code&lt;/h2>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="retroharmonize_r_documentation.webp"
>
&lt;h2 id="contribute-to-documentation">Contribute to documentation&lt;/h2>
&lt;hr>
&lt;h2 id="r-is-a-functional-language">R is a functional language&lt;/h2>
&lt;ul>
&lt;li>R is both a statistical environment and a programming language&lt;/li>
&lt;li>R, the open source and further developed version of the S language, is mainly functional&lt;/li>
&lt;li>If you have done a task twice, the third time you had better write a function so that you can keep reusing it.&lt;/li>
&lt;li>Most of your effort will be to find a well-written function for your work&lt;/li>
&lt;li>If you cannot find a function, you will modify somebody else&amp;rsquo;s function, or eventually write your own&lt;/li>
&lt;/ul>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="retroharmonize_r_code.webp"
>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="rmd_example.webp"
>
&lt;h2 id="r--yaml--markdown--web-ready">R + YAML + markdown = web ready&lt;/h2>
&lt;hr>
&lt;ul>
&lt;li>&lt;a href="https://learnxinyminutes.com/docs/yaml/" target="_blank" rel="noopener">Learn YAML in Y minutes&lt;/a>: tell the computer what you want to do with a document&lt;/li>
&lt;li>&lt;a href="https://rmarkdown.rstudio.com/authoring_basics.html" target="_blank" rel="noopener">R Markdown basics&lt;/a>: it is just plain markdown that lets you insert small R code &amp;lsquo;chunks&amp;rsquo;.&lt;/li>
&lt;li>&lt;a href="https://github.com/mundimark/awesome-markdown-editors" target="_blank" rel="noopener">Awesome markdown editors and pre-writers&lt;/a>: find a convenient tool&lt;/li>
&lt;li>&lt;a href="https://workspace.google.com/marketplace/app/docs_to_markdown/700168918607" target="_blank" rel="noopener">Google Docs to markdown&lt;/a>: practice by translating your Google Docs text to markdown. It is &lt;em>very&lt;/em> easy.&lt;/li>
&lt;/ul>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="retroharmonize_website.webp"
>
&lt;h2 id="package-and-release-a-team-effort">Package and release: a team effort&lt;/h2>
&lt;hr>
&lt;h2 id="our-open-source-development-projects">Our open source development projects&lt;/h2>
&lt;ul>
&lt;li>🔢 &lt;a href="https://dataset.dataobservatory.eu/" target="_blank" rel="noopener">dataset&lt;/a>: Synchronize datasets with global knowledge hubs&lt;/li>
&lt;li>#️⃣ &lt;a href="https://statcodelists.dataobservatory.eu/" target="_blank" rel="noopener">statcodelists&lt;/a>: Make your data codes understood globally&lt;/li>
&lt;li>♻️ &lt;a href="https://iotables.dataobservatory.eu/" target="_blank" rel="noopener">iotables&lt;/a>: Create economic or environmental impact assessments in any EU country&lt;/li>
&lt;li>🌍 &lt;a href="https://regions.dataobservatory.eu/" target="_blank" rel="noopener">regions&lt;/a>: Create more granular statistics from raw survey data in any EU country&lt;/li>
&lt;li>✅ &lt;a href="https://retroharmonize.dataobservatory.eu/" target="_blank" rel="noopener">retroharmonize&lt;/a>: Harmonize question banks, recycle answers from past surveys&lt;/li>
&lt;li>⏭️ &lt;a href="https://reprex.nl/#releases" target="_blank" rel="noopener">All on one page&lt;/a>&lt;/li>
&lt;/ul>
&lt;hr>
&lt;section data-noprocess data-shortcode-slide
data-background-image="create_with_reprex.webp"
>
&lt;h2 id="create-with-us">Create with us&lt;/h2>
&lt;hr>
&lt;h1 id="questions">Questions?&lt;/h1>
&lt;p>&lt;a href="https://reprex.nl/#contact" target="_blank" rel="noopener">Email&lt;/a> | &lt;a href="https://keybase.io/team/reprexcommunity" target="_blank" rel="noopener">Keybase&lt;/a>&lt;/p>
&lt;p>LinkedIn: &lt;a href="https://www.linkedin.com/in/antaldaniel/" target="_blank" rel="noopener">Daniel Antal&lt;/a> - &lt;a href="https://www.linkedin.com/company/68855596" target="_blank" rel="noopener">Reprex&lt;/a> | &lt;a href="https://reprex.nl/" target="_blank" rel="noopener">Home&lt;/a>&lt;/p></description></item><item><title>Creating Algorithmic Tools to Interpret and Communicate Open Data Efficiently</title><link>https://danielantal.eu/post/2021-06-04-developer-leo-lahti/</link><pubDate>Fri, 04 Jun 2021 10:00:00 +0000</pubDate><guid>https://danielantal.eu/post/2021-06-04-developer-leo-lahti/</guid><description>&lt;p>&lt;strong>As a developer at rOpenGov, what type of data do you usually use in your work?&lt;/strong>&lt;/p>
&lt;p>As an academic data scientist whose research focuses on the development of general-purpose algorithmic methods, I work with a range of applications from life sciences to humanities. Population studies play a big role in our research, and often the information that we can draw from public sources - geospatial, demographic, environmental - provides invaluable support. We typically use open data in combination with sensitive research data but some of the research questions can be readily addressed based on open data from statistical authorities such as Statistics Finland or Eurostat.&lt;/p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="https://danielantal.eu/img/partners/rOpenGov-intro.png" alt="" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;p>&lt;strong>In your ideal data world, what would be the ultimate dataset, or datasets that you would like to see in the Music Data Observatory?&lt;/strong>&lt;/p>
&lt;p>One line of our research analyses the historical trends and spread of knowledge production, in particular book printing based on large-scale metadata collections. It would be interesting to extend this research to music, to understand the contemporary trends as well as the broader historical developments. Gaining access to a large systematic collection of music and composition data from different countries across long periods of time would make this possible.&lt;/p>
&lt;p>&lt;strong>Why did you decide to join the challenge and why do you think that this would be a game changer for researchers and policymakers?&lt;/strong>&lt;/p>
&lt;p>Joining the challenge was a natural development based on our overall activities in this area; &lt;a href="http://ropengov.org/community/" target="_blank" rel="noopener">the rOpenGov project&lt;/a> has been around for a decade now, since the early days of the broader open data movement. This has also created an active international developer network and we felt well equipped for picking up the challenge. The game changer for researchers is that the project highlights the importance of data quality, even when dealing with official statistics, and provides new methods to solve these issues efficiently through the open collaboration model. For policymakers, this provides access to new high-quality curated data and case studies that can support evidence-based decision-making.&lt;/p>
&lt;p>&lt;strong>Do you have a favorite, or most used open governmental or open science data source? What do you think about it? Could it be improved?&lt;/strong>&lt;/p>
&lt;p>Regarding open government data, one of my favorites is not a single data source but a data representation standard. The &lt;a href="https://www.scb.se/en/services/statistical-programs-for-px-files/#:~:text=PX%20is%20a%20standard%20format,and%20data." target="_blank" rel="noopener">px format&lt;/a> is widely used by statistical authorities in various countries, and this has allowed us to create R tools that allow the retrieval and analysis of official statistics from many countries across Europe, spanning dozens of statistical institutions. Standardization of open data formats allows us to build robust algorithmic tools for downstream data analysis and visualization. Open government data is still too often shared in obscure, non-standard or closed-source file formats and this is creating significant bottlenecks for the development of scalable and interoperable AI and machine learning methods that can harness the full potential of open data.&lt;/p>
&lt;figure id="figure-regarding-open-government-data-one-of-my-favorites-is-not-a-single-data-source-but-a-data-representation-standard-the-px-format">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="https://danielantal.eu/img/developers/PxWeb.png" alt="Regarding open government data, one of my favorites is not a single data source but a data representation standard, the Px format." loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption data-pre="Figure&amp;nbsp;" data-post=":&amp;nbsp;" class="numbered">
Regarding open government data, one of my favorites is not a single data source but a data representation standard, the Px format.
&lt;/figcaption>&lt;/figure>
&lt;p>&lt;strong>From your perspective, what do you see being the greatest problem with open data in 2021?&lt;/strong>&lt;/p>
&lt;p>Although there are a variety of open data sources available (and the numbers continue to increase), the availability of open algorithmic tools to interpret and communicate open data efficiently is lagging behind. One of the greatest challenges for open data in 2021 is to demonstrate how we can maximize the potential of open data by designing smart tools for open data analytics.&lt;/p>
&lt;p>&lt;strong>What can our automated data observatories do to make open data more credible in the European economic policy community and be accepted as verified information?&lt;/strong>&lt;/p>
&lt;p>The role of the professional network backing up the project, and the possibility of getting critical feedback and later adoption by the academic communities will support the efforts. Transparency of the data harmonization operations is the key to credibility, and will be further supported by concrete benchmarks that highlight the critical differences in drawing conclusions based on original sources versus the harmonized high-quality data sets.&lt;/p>
&lt;figure id="figure-we-need-to-get-critical-feedback-and-later-adoption-by-the-academic-communities">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="https://danielantal.eu/img/observatory_screenshots/greendeal_and_zenodo.png" alt="We need to get critical feedback and later adoption by the academic communities." loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption data-pre="Figure&amp;nbsp;" data-post=":&amp;nbsp;" class="numbered">
We need to get critical feedback and later adoption by the academic communities.
&lt;/figcaption>&lt;/figure>
&lt;p>&lt;strong>How can we ensure the long-term sustainability of the efforts?&lt;/strong>&lt;/p>
&lt;p>The open data space is so large that no single individual or institution can address all the emerging needs in this area. The open developer networks play a huge role in the development of algorithmic methods, and strong communities have developed around specific open data analytical environments such as R, Python, and Julia. These communities support networked collaboration and provide services such as software peer review. The long-term sustainability will depend on the support that such developer communities can receive from individual contributors as well as from institutions and governments.&lt;/p>
&lt;figure id="figure-join-our-open-collaboration-economy-data-observatory-team-as-a-data-curatorauthorscurator-developerauthorsdeveloper-or-business-developerauthorsteam-or-share-your-data-in-our-public-repository-economy-data-observatory-on-zenodohttpszenodoorgcommunitieseconomy_observatory">
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="https://danielantal.eu/img/observatory_screenshots/edo_and_zenodo.png" alt="Join our open collaboration Economy Data Observatory team as a data curator, developer or business developer, or share your data in our public repository Economy Data Observatory on Zenodo" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;figcaption data-pre="Figure&amp;nbsp;" data-post=":&amp;nbsp;" class="numbered">
Join our open collaboration Economy Data Observatory team as a &lt;a href="https://danielantal.eu/authors/curator">data curator&lt;/a>, &lt;a href="https://danielantal.eu/authors/developer">developer&lt;/a> or &lt;a href="https://danielantal.eu/authors/team">business developer&lt;/a>, or share your data in our public repository &lt;a href="https://zenodo.org/communities/economy_observatory/" target="_blank" rel="noopener">Economy Data Observatory on Zenodo&lt;/a>
&lt;/figcaption>&lt;/figure>
&lt;h2 id="join-us">Join us&lt;/h2>
&lt;p>&lt;em>Join our open collaboration Economy Data Observatory team as a &lt;a href="https://danielantal.eu/authors/curator">data curator&lt;/a>, &lt;a href="https://danielantal.eu/authors/developer">developer&lt;/a> or &lt;a href="https://danielantal.eu/authors/team">business developer&lt;/a>. More interested in environmental impact analysis? Try our &lt;a href="https://greendeal.dataobservatory.eu/#contributors" target="_blank" rel="noopener">Green Deal Data Observatory&lt;/a> team! Or does your interest lie more in data governance, trustworthy AI and other digital market problems? Check out our &lt;a href="https://music.dataobservatory.eu/#contributors" target="_blank" rel="noopener">Digital Music Observatory&lt;/a> team!&lt;/em>&lt;/p></description></item></channel></rss>