<?xml version="1.0" encoding="ISO-8859-1"?><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<front>
<journal-meta>
<journal-id>1646-9895</journal-id>
<journal-title><![CDATA[RISTI - Revista Ibérica de Sistemas e Tecnologias de Informação]]></journal-title>
<abbrev-journal-title><![CDATA[RISTI]]></abbrev-journal-title>
<issn>1646-9895</issn>
<publisher>
<publisher-name><![CDATA[AISTI - Associação Ibérica de Sistemas e Tecnologias de Informação]]></publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id>S1646-98952022000200054</article-id>
<article-id pub-id-type="doi">10.17013/risti.46.54-70</article-id>
<title-group>
<article-title xml:lang="pt"><![CDATA[Requisitos para a ciência de dados: analisando anúncios de vagas de emprego com mineração de texto]]></article-title>
<article-title xml:lang="en"><![CDATA[Requirements for Data Science: analyzing job postings with text mining]]></article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Guimarães]]></surname>
<given-names><![CDATA[André José Ribeiro]]></given-names>
</name>
<xref ref-type="aff" rid="Aff"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Mendes Júnior]]></surname>
<given-names><![CDATA[Ricardo]]></given-names>
</name>
<xref ref-type="aff" rid="Aff"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Freitas]]></surname>
<given-names><![CDATA[Maria do Carmo Duarte]]></given-names>
</name>
<xref ref-type="aff" rid="Aff"/>
</contrib>
</contrib-group>
<aff id="Af1">
<institution><![CDATA[Universidade Federal do Paraná]]></institution>
<addr-line><![CDATA[Curitiba]]></addr-line>
<country>Brazil</country>
</aff>
<pub-date pub-type="pub">
<day>30</day>
<month>06</month>
<year>2022</year>
</pub-date>
<pub-date pub-type="epub">
<day>30</day>
<month>06</month>
<year>2022</year>
</pub-date>
<numero>46</numero>
<fpage>54</fpage>
<lpage>70</lpage>
<copyright-statement/>
<copyright-year/>
<self-uri xlink:href="http://scielo.pt/scielo.php?script=sci_arttext&amp;pid=S1646-98952022000200054&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://scielo.pt/scielo.php?script=sci_abstract&amp;pid=S1646-98952022000200054&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://scielo.pt/scielo.php?script=sci_pdf&amp;pid=S1646-98952022000200054&amp;lng=en&amp;nrm=iso"></self-uri><abstract abstract-type="short" xml:lang="pt"><p><![CDATA[Resumo Esta pesquisa identifica os requisitos para cientistas de dados no Brasil em anúncios de emprego. Para analisar estes documentos, adota métodos de mineração de texto: n-grama, modelagem de tópico e agrupamento. Os resultados apontam uma concentração de vagas em São Paulo e revelam que a modalidade remota é a segunda mais ofertada. Além disso, destaca que os salários no Brasil estão abaixo da média de outros países, mesmo que as organizações procurem por profissionais experientes e com alto nível educacional. Quanto aos requisitos, há o predomínio de habilidades técnicas como machine learning, modelos estatísticos, python, banco de dados, dentre outras. Para as técnicas de mineração, demonstra que n-grama e o agrupamento são mais adequadas que a modelagem de tópicos.]]></p></abstract>
<abstract abstract-type="short" xml:lang="en"><p><![CDATA[Abstract This research identifies the requirements for data scientists in Brazil as stated in job postings. To analyze these documents, it adopts three text mining methods: n-gram analysis, topic modeling, and clustering. The findings point to a concentration of job openings in São Paulo and show that remote work is the second most offered modality. They also indicate that salaries in Brazil are below the average of other countries, even though organizations look for experienced professionals with a high level of education. Regarding the requirements, technical skills predominate, such as machine learning, statistical models, Python, and databases, among others. As for the text mining techniques, the results show that n-gram analysis and clustering are more suitable than topic modeling.]]></p></abstract>
<kwd-group>
<kwd lng="pt"><![CDATA[Cientista de dados]]></kwd>
<kwd lng="pt"><![CDATA[Mineração de texto]]></kwd>
<kwd lng="pt"><![CDATA[Requisitos para cientista de dados]]></kwd>
<kwd lng="pt"><![CDATA[Competências]]></kwd>
<kwd lng="en"><![CDATA[Data scientist]]></kwd>
<kwd lng="en"><![CDATA[Text mining]]></kwd>
<kwd lng="en"><![CDATA[Requirements for data scientist]]></kwd>
<kwd lng="en"><![CDATA[Competencies]]></kwd>
</kwd-group>
</article-meta>
</front><back>
<ref-list>
<ref id="B1">
<nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Anderson]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
<name>
<surname><![CDATA[McGuffee]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Uminsky]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Data science as an undergraduate degree]]></article-title>
<source><![CDATA[SIGCSE 2014 - Proceedings of the 45th ACM Technical Symposium on Computer Science Education]]></source>
<year>2014</year>
<page-range>705-6</page-range></nlm-citation>
</ref>
<ref id="B2">
<nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Anthony]]></surname>
<given-names><![CDATA[L.]]></given-names>
</name>
</person-group>
<source><![CDATA[AntConc (Version 4.0.4) [Computer Software]]]></source>
<year>2022</year>
</nlm-citation>
</ref>
<ref id="B3">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ba&#353;karada</surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Koronios]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Unicorn data scientist: the rarest of breeds]]></article-title>
<source><![CDATA[Program]]></source>
<year>2017</year>
<volume>51</volume>
<numero>1</numero>
<issue>1</issue>
<page-range>65-74</page-range></nlm-citation>
</ref>
<ref id="B4">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Baumeister]]></surname>
<given-names><![CDATA[F.]]></given-names>
</name>
<name>
<surname><![CDATA[Barbosa]]></surname>
<given-names><![CDATA[M. W.]]></given-names>
</name>
<name>
<surname><![CDATA[Gomes]]></surname>
<given-names><![CDATA[R. R.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[What is required to be a data scientist? Analyzing job descriptions with centering resonance analysis]]></article-title>
<source><![CDATA[International Journal of Human Capital and Information Technology Professionals]]></source>
<year>2020</year>
<volume>11</volume>
<numero>4</numero>
<issue>4</issue>
<page-range>21-40</page-range></nlm-citation>
</ref>
<ref id="B5">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Bedregal-Alpaca]]></surname>
<given-names><![CDATA[N.]]></given-names>
</name>
<name>
<surname><![CDATA[Aruquipa-Velazco]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
<name>
<surname><![CDATA[Cornejo-Aparicio]]></surname>
<given-names><![CDATA[V.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Técnicas de Data Mining para extraer perfiles comportamiento académico y predecir la deserción universitaria]]></article-title>
<source><![CDATA[Revista Ibérica de Sistemas e Tecnologias de Informação]]></source>
<year>2020</year>
<numero>E27</numero>
<issue>E27</issue>
<page-range>592-604</page-range></nlm-citation>
</ref>
<ref id="B6">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Bengfort]]></surname>
<given-names><![CDATA[B.]]></given-names>
</name>
<name>
<surname><![CDATA[Bilbro]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
<name>
<surname><![CDATA[Ojeda]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
</person-group>
<source><![CDATA[Applied Text Analysis with Python: Enabling Language-Aware Data Products with Machine Learning]]></source>
<year>2018</year>
<publisher-name>O&#8217;Reilly Media, Inc</publisher-name>
</nlm-citation>
</ref>
<ref id="B7">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Brandt]]></surname>
<given-names><![CDATA[P. S.]]></given-names>
</name>
</person-group>
<source><![CDATA[The emergence of the data science profession]]></source>
<year>2016</year>
<publisher-name><![CDATA[Graduate School of Arts and Sciences, Columbia University]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B8">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Breternitz]]></surname>
<given-names><![CDATA[V. J.]]></given-names>
</name>
<name>
<surname><![CDATA[Lopes]]></surname>
<given-names><![CDATA[F. S.]]></given-names>
</name>
<name>
<surname><![CDATA[Silva]]></surname>
<given-names><![CDATA[L. A. da.]]></given-names>
</name>
</person-group>
<source><![CDATA[Big Data/Analytics: formação e gestão de cientistas de dados]]></source>
<year>2015</year>
<conf-name><![CDATA[CONTECSI - International Conference on Information Systems and Technology Management]]></conf-name>
<conf-loc> </conf-loc>
<page-range>1-8</page-range></nlm-citation>
</ref>
<ref id="B9">
<nlm-citation citation-type="">
<collab>Burtch Works</collab>
<source><![CDATA[The Burtch Works Study: Salaries of Data Science & Analytics Professionals]]></source>
<year>2021</year>
</nlm-citation>
</ref>
<ref id="B10">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Cao]]></surname>
<given-names><![CDATA[L.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Data Science: Profession and Education]]></article-title>
<source><![CDATA[IEEE Intelligent Systems]]></source>
<year>2019</year>
<volume>34</volume>
<numero>5</numero>
<issue>5</issue>
<page-range>35-44</page-range></nlm-citation>
</ref>
<ref id="B11">
<nlm-citation citation-type="">
<collab>Carrot2 Clustering Engine</collab>
<source><![CDATA[Clustering Workbench]]></source>
<year>2021</year>
</nlm-citation>
</ref>
<ref id="B12">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Chapman]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
<name>
<surname><![CDATA[Clinton]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Kerber]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
<name>
<surname><![CDATA[Khabaza]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
<name>
<surname><![CDATA[Reinartz]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
<name>
<surname><![CDATA[Shearer]]></surname>
<given-names><![CDATA[C.]]></given-names>
</name>
<name>
<surname><![CDATA[Wirth]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
</person-group>
<source><![CDATA[CRISP-DM 1.0: Step-by-step data mining guide]]></source>
<year>2000</year>
<publisher-name><![CDATA[SPSS]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B13">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Chen]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Ayala]]></surname>
<given-names><![CDATA[B. R.]]></given-names>
</name>
<name>
<surname><![CDATA[Alsmadi]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
<name>
<surname><![CDATA[Wang]]></surname>
<given-names><![CDATA[G.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Fundamentals of Data Science for Future Data Scientists]]></article-title>
<source><![CDATA[Analytics and Knowledge Management]]></source>
<year>2018</year>
<page-range>167-94</page-range><publisher-name><![CDATA[CRC Press]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B14">
<nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Cunha]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
</person-group>
<source><![CDATA[Procuram-se cientistas de dados]]></source>
<year>2018</year>
</nlm-citation>
</ref>
<ref id="B15">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Curty]]></surname>
<given-names><![CDATA[R. G.]]></given-names>
</name>
<name>
<surname><![CDATA[Serafim]]></surname>
<given-names><![CDATA[J. D. S.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[A formação em ciência de dados: uma análise preliminar do panorama estadunidense]]></article-title>
<source><![CDATA[Informação & Informação]]></source>
<year>2016</year>
<volume>21</volume>
<numero>2</numero>
<issue>2</issue>
<page-range>307-31</page-range></nlm-citation>
</ref>
<ref id="B16">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Davenport]]></surname>
<given-names><![CDATA[T. H.]]></given-names>
</name>
<name>
<surname><![CDATA[Patil]]></surname>
<given-names><![CDATA[D. J.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Data scientist: The sexiest job of the 21st century]]></article-title>
<source><![CDATA[Harvard Business Review]]></source>
<year>2012</year>
<volume>90</volume>
<numero>10</numero>
<issue>10</issue>
<page-range>5</page-range></nlm-citation>
</ref>
<ref id="B17">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Demchenko]]></surname>
<given-names><![CDATA[Y.]]></given-names>
</name>
<name>
<surname><![CDATA[Belloum]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Los]]></surname>
<given-names><![CDATA[W.]]></given-names>
</name>
<name>
<surname><![CDATA[Wiktorski]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
<name>
<surname><![CDATA[Manieri]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Brocks]]></surname>
<given-names><![CDATA[H.]]></given-names>
</name>
<name>
<surname><![CDATA[Becker]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Heutelbeck]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
<name>
<surname><![CDATA[Hemmje]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Brewer]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[EDISON data science framework: A foundation for building data science profession for research and industry]]></article-title>
<source><![CDATA[Proceedings of the International Conference on Cloud Computing Technology and Science, CloudCom]]></source>
<year>2016</year>
<page-range>620-6</page-range></nlm-citation>
</ref>
<ref id="B18">
<nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Demchenko]]></surname>
<given-names><![CDATA[Y.]]></given-names>
</name>
<name>
<surname><![CDATA[Belloum]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Wiktorski]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
</person-group>
<source><![CDATA[EDISON Data Science Framework: Part 1. Data Science Competence Framework (CF-DS)]]></source>
<year>2017</year>
</nlm-citation>
</ref>
<ref id="B19">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Dhar]]></surname>
<given-names><![CDATA[V.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Data science and prediction]]></article-title>
<source><![CDATA[Communications of the ACM]]></source>
<year>2013</year>
<volume>56</volume>
<numero>12</numero>
<issue>12</issue>
<page-range>64-73</page-range></nlm-citation>
</ref>
<ref id="B20">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Finzer]]></surname>
<given-names><![CDATA[W.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[The Data Science Education Dilemma]]></article-title>
<source><![CDATA[Technology Innovations in Statistics Education]]></source>
<year>2013</year>
<volume>7</volume>
<numero>2</numero>
<issue>2</issue>
<page-range>1-9</page-range></nlm-citation>
</ref>
<ref id="B21">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Gajzler]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Text and data mining techniques in aspect of knowledge acquisition for decision support system in construction industry]]></article-title>
<source><![CDATA[Technological and Economic Development of Economy]]></source>
<year>2010</year>
<volume>16</volume>
<numero>2</numero>
<issue>2</issue>
<page-range>219-32</page-range></nlm-citation>
</ref>
<ref id="B22">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Gottipati]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Shim]]></surname>
<given-names><![CDATA[K. J.]]></given-names>
</name>
<name>
<surname><![CDATA[Sahoo]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
</person-group>
<source><![CDATA[Glassdoor job description analytics - Analyzing data science professional roles and skills]]></source>
<year>2021</year>
<conf-name><![CDATA[IEEE Global Engineering Education Conference, EDUCON]]></conf-name>
<conf-date>2021</conf-date>
<conf-loc> </conf-loc>
<page-range>1329-36</page-range></nlm-citation>
</ref>
<ref id="B23">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Grossi]]></surname>
<given-names><![CDATA[V.]]></given-names>
</name>
<name>
<surname><![CDATA[Giannotti]]></surname>
<given-names><![CDATA[F.]]></given-names>
</name>
<name>
<surname><![CDATA[Pedreschi]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
<name>
<surname><![CDATA[Manghi]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
<name>
<surname><![CDATA[Pagano]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
<name>
<surname><![CDATA[Assante]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Data science: a game changer for science and innovation]]></article-title>
<source><![CDATA[International Journal of Data Science and Analytics]]></source>
<year>2021</year>
<volume>11</volume>
<numero>4</numero>
<issue>4</issue>
<page-range>263-78</page-range></nlm-citation>
</ref>
<ref id="B24">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Hall]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
<name>
<surname><![CDATA[Phan]]></surname>
<given-names><![CDATA[W.]]></given-names>
</name>
<name>
<surname><![CDATA[Whitson]]></surname>
<given-names><![CDATA[K.]]></given-names>
</name>
</person-group>
<source><![CDATA[The Evolution of Analytics: Opportunities and Challenges for Machine Learning in Business]]></source>
<year>2016</year>
<publisher-name>O&#8217;Reilly Media, Inc</publisher-name>
</nlm-citation>
</ref>
<ref id="B25">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Halwani]]></surname>
<given-names><![CDATA[M. A.]]></given-names>
</name>
<name>
<surname><![CDATA[Amirkiaee]]></surname>
<given-names><![CDATA[S. Y.]]></given-names>
</name>
<name>
<surname><![CDATA[Evangelopoulos]]></surname>
<given-names><![CDATA[N.]]></given-names>
</name>
<name>
<surname><![CDATA[Prybutok]]></surname>
<given-names><![CDATA[V.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Job qualifications study for data science and big data professions]]></article-title>
<source><![CDATA[Information Technology & People]]></source>
<year>2021</year>
</nlm-citation>
</ref>
<ref id="B26">
<nlm-citation citation-type="">
<collab>Kaggle</collab>
<source><![CDATA[State of Machine Learning and Data Science 2021]]></source>
<year>2021</year>
</nlm-citation>
</ref>
<ref id="B27">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Kim]]></surname>
<given-names><![CDATA[J. Y.]]></given-names>
</name>
<name>
<surname><![CDATA[Lee]]></surname>
<given-names><![CDATA[C. K.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[An empirical analysis of requirements for data scientists using online job postings]]></article-title>
<source><![CDATA[International Journal of Software Engineering and Its Applications]]></source>
<year>2016</year>
<volume>10</volume>
<numero>4</numero>
<issue>4</issue>
<page-range>161-72</page-range></nlm-citation>
</ref>
<ref id="B28">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Lantz]]></surname>
<given-names><![CDATA[B.]]></given-names>
</name>
</person-group>
<source><![CDATA[Machine Learning with R]]></source>
<year>2015</year>
<edition>2</edition>
<publisher-name><![CDATA[Packt Publishing]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B29">
<nlm-citation citation-type="">
<collab>LinkedIn</collab>
<article-title xml:lang=""><![CDATA[2020 Emerging Jobs Report]]></article-title>
<source><![CDATA[LinkedIn]]></source>
<year>2020</year>
</nlm-citation>
</ref>
<ref id="B30">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Loukides]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
</person-group>
<source><![CDATA[What is data science? The future belongs to the companies and people that turn data into products]]></source>
<year>2012</year>
<publisher-name>O&#8217;Reilly Media, Inc</publisher-name>
</nlm-citation>
</ref>
<ref id="B31">
<nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Mabey]]></surname>
<given-names><![CDATA[B.]]></given-names>
</name>
</person-group>
<source><![CDATA[pyLDAvis Documentation]]></source>
<year>2018</year>
</nlm-citation>
</ref>
<ref id="B32">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Metelo]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Bernardino]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Pedrosa]]></surname>
<given-names><![CDATA[I.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Avaliação de Ferramentas Open Source para Data Science usando a Metodologia OSSpal]]></article-title>
<source><![CDATA[RISTI - Revista Ibérica de Sistemas e Tecnologias de Informação]]></source>
<year>2021</year>
<numero>E46</numero>
<issue>E46</issue>
<page-range>588-607</page-range></nlm-citation>
</ref>
<ref id="B33">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Meyer]]></surname>
<given-names><![CDATA[M. A.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Healthcare data scientist qualifications, skills, and job focus: A content analysis of job postings]]></article-title>
<source><![CDATA[Journal of the American Medical Informatics Association]]></source>
<year>2019</year>
<volume>26</volume>
<numero>5</numero>
<issue>5</issue>
<page-range>383-91</page-range></nlm-citation>
</ref>
<ref id="B34">
<nlm-citation citation-type="">
<collab>NIST Big Data Public Working Group</collab>
<source><![CDATA[NIST Big Data Interoperability Framework: Volume 1, Definitions]]></source>
<year>2015</year>
</nlm-citation>
</ref>
<ref id="B35">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Provost]]></surname>
<given-names><![CDATA[F.]]></given-names>
</name>
<name>
<surname><![CDATA[Fawcett]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Data Science and its Relationship to Big Data and Data-Driven Decision Making]]></article-title>
<source><![CDATA[Big Data]]></source>
<year>2013</year>
<volume>1</volume>
<numero>1</numero>
<issue>1</issue>
<page-range>51-9</page-range></nlm-citation>
</ref>
<ref id="B36">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Provost]]></surname>
<given-names><![CDATA[F.]]></given-names>
</name>
<name>
<surname><![CDATA[Fawcett]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
</person-group>
<source><![CDATA[Data Science for Business: What you need to know about data mining and data-analytic thinking]]></source>
<year>2013</year>
<publisher-name>O&#8217;Reilly Media, Inc</publisher-name>
</nlm-citation>
</ref>
<ref id="B37">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Raschka]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
</person-group>
<source><![CDATA[Python Machine Learning: Unlock deeper insights into machine learning with this vital guide to cutting-edge predictive analytics]]></source>
<year>2016</year>
<publisher-name><![CDATA[Packt Publishing]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B38">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>&#344;eh&#367;&#345;ek</surname>
<given-names><![CDATA[R.]]></given-names>
</name>
<name>
<surname><![CDATA[Sojka]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Gensim-python framework for vector space modelling]]></article-title>
<source><![CDATA[NLP Centre, Faculty of Informatics, Masaryk University, Brno, Czech Republic]]></source>
<year>2011</year>
<volume>3</volume>
<numero>2</numero>
<issue>2</issue>
</nlm-citation>
</ref>
<ref id="B39">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Reis]]></surname>
<given-names><![CDATA[L. C. R.]]></given-names>
</name>
<name>
<surname><![CDATA[Sá]]></surname>
<given-names><![CDATA[M. I. da F. e.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Big Data: Um novo campo de atuação para bibliotecários]]></article-title>
<source><![CDATA[Prisma.Com]]></source>
<year>2020</year>
<volume>41</volume>
<page-range>231-50</page-range></nlm-citation>
</ref>
<ref id="B40">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Saltz]]></surname>
<given-names><![CDATA[J. S.]]></given-names>
</name>
<name>
<surname><![CDATA[Grady]]></surname>
<given-names><![CDATA[N. W.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[The ambiguity of data science team roles and the need for a data science workforce framework]]></article-title>
<source><![CDATA[Proceedings - 2017 IEEE International Conference on Big Data, Big Data 2017, 2018-Janua]]></source>
<year>2017</year>
<page-range>2355-61</page-range></nlm-citation>
</ref>
<ref id="B41">
<nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Sievert]]></surname>
<given-names><![CDATA[C.]]></given-names>
</name>
<name>
<surname><![CDATA[Shirley]]></surname>
<given-names><![CDATA[K.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[LDAvis: A method for visualizing and interpreting topics]]></article-title>
<source><![CDATA[Proceedings of the Workshop on Interactive Language Learning, Visualization, and Interfaces]]></source>
<year>2014</year>
<page-range>63-70</page-range></nlm-citation>
</ref>
<ref id="B42">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Stark]]></surname>
<given-names><![CDATA[H.]]></given-names>
</name>
<name>
<surname><![CDATA[Hawamdeh]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Relating Big Data and Data Science to the Wider Concept of Knowledge Management]]></article-title>
<source><![CDATA[Analytics and Knowledge Management]]></source>
<year>2018</year>
<page-range>141-66</page-range><publisher-name><![CDATA[CRC Press]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B43">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Wesslen]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
</person-group>
<source><![CDATA[Computer-Assisted Text Analysis for Social Science: Topic Models and Beyond]]></source>
<year>2018</year>
<publisher-name><![CDATA[ArXivLabs]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B44">
<nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Wolfram]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
</person-group>
<source><![CDATA[A pesquisa bibliométrica na era do big data: Desafios e oportunidades. Bibliometria e Cientometria No Brasil: Infraestrutura Para Avaliação Da Pesquisa Científica Na Era Do Big Data]]></source>
<year>2017</year>
</nlm-citation>
</ref>
</ref-list>
</back>
</article>
