<?xml version="1.0" encoding="ISO-8859-1"?><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<front>
<journal-meta>
<journal-id>0873-3015</journal-id>
<journal-title><![CDATA[Millenium - Journal of Education, Technologies, and Health]]></journal-title>
<abbrev-journal-title><![CDATA[Mill]]></abbrev-journal-title>
<issn>0873-3015</issn>
<publisher>
<publisher-name><![CDATA[Instituto Politécnico de Viseu (IPV)]]></publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id>S0873-30152024000100301</article-id>
<article-id pub-id-type="doi">10.29352/mill0223.31378</article-id>
<title-group>
<article-title xml:lang="en"><![CDATA[A methodological proposal to address the academic dropout phenomenon based on an intelligent prediction model: a case study]]></article-title>
<article-title xml:lang="pt"><![CDATA[Uma proposta metodológica para abordar o fenômeno da deserção académica a partir de um modelo de predição inteligente: um estudo de caso]]></article-title>
<article-title xml:lang="es"><![CDATA[Una propuesta metodológica para abordar el fenómeno de la deserción académica a partir de un modelo de predicción inteligente: un caso de estudio]]></article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Villa-Murillo]]></surname>
<given-names><![CDATA[Adriana]]></given-names>
</name>
<xref ref-type="aff" rid="Aff"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Costa]]></surname>
<given-names><![CDATA[Luís]]></given-names>
</name>
<xref ref-type="aff" rid="Aff"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Vásquez]]></surname>
<given-names><![CDATA[Carlos]]></given-names>
</name>
<xref ref-type="aff" rid="Aff"/>
</contrib>
</contrib-group>
<aff id="Af1">
<institution><![CDATA[Universidad Viña del Mar, Escuela de Ciencias]]></institution>
<addr-line><![CDATA[Viña del Mar ]]></addr-line>
<country>Chile</country>
</aff>
<aff id="Af2">
<institution><![CDATA[Universidad Técnica de Ambato, Facultad de Ciencias Agrícolas]]></institution>
<addr-line><![CDATA[Ambato ]]></addr-line>
<country>Ecuador</country>
</aff>
<pub-date pub-type="pub">
<day>30</day>
<month>04</month>
<year>2024</year>
</pub-date>
<pub-date pub-type="epub">
<day>30</day>
<month>04</month>
<year>2024</year>
</pub-date>
<numero>23</numero>
<copyright-statement/>
<copyright-year/>
<self-uri xlink:href="http://scielo.pt/scielo.php?script=sci_arttext&amp;pid=S0873-30152024000100301&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://scielo.pt/scielo.php?script=sci_abstract&amp;pid=S0873-30152024000100301&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://scielo.pt/scielo.php?script=sci_pdf&amp;pid=S0873-30152024000100301&amp;lng=en&amp;nrm=iso"></self-uri><abstract abstract-type="short" xml:lang="en"><p><![CDATA[Abstract  Introduction:  University dropout is now regarded as a complex phenomenon that goes beyond the number of students who do not enroll and that continues to grow, especially in the first years of study.  Objective:  This study proposes a prediction model that combines Survival Analysis, Decision Trees, and Random Forest, within a Machine Learning approach, for the early diagnosis of possible factors causing dropout among university students.  Methods:  The proposal consists of three phases. Phase 1 applies Survival Analysis to estimate the probability that a student remains enrolled (survival). Phase 2 takes the probability obtained in the previous phase and uses it as the response variable in a Decision Tree model to establish survival patterns around the variables considered. Finally, phase 3 identifies the critical variables in the model using Random Forest.  Results:  The proposed methodology allowed the design of a prediction model that identifies the main segmentation variables in the behavior patterns of possible cases of academic dropout.  Conclusion:  Although the proposal was developed for the particular case of a Chilean university, the efficient combination of metaheuristics allows the methodology to be extrapolated to any academic context and reality, provided that the conditions and needs of each institution are considered.]]></p></abstract>
<abstract abstract-type="short" xml:lang="pt"><p><![CDATA[Resumo  Introdução:  A deserção universitária é atualmente considerada um fenômeno complexo que vai além do número de estudantes não matriculados, e que vem em contínuo crescimento sobretudo nos primeiros anos de estudo.  Objetivo: No presente estudo, é proposto um modelo de predição que combina Análise de Sobrevivência, Árvores de Decisão e Random Forest, sob a filosofia de Machine Learning, para o diagnóstico precoce dos possíveis fatores de deserção em estudantes universitários.  Métodos:  A proposta consiste em 3 fases: a Análise de Sobrevivência que permite estimar a probabilidade de permanência do aluno (sobrevivência). A fase 2 parte do valor de probabilidade obtido na fase anterior e o utiliza como variável resposta no processo de modelagem baseado em árvores de decisão para estabelecer padrões de sobrevivência em torno das variáveis consideradas. Finalmente, na fase 3, as variáveis críticas do modelo são identificadas usando Random Forest.  Resultados:  A metodologia proposta permitiu desenhar um modelo de previsão, que identifica as principais variáveis de segmentação em padrões de comportamento de possíveis casos de deserção acadêmica.  Conclusão:  Embora a proposta tenha sido desenvolvida a partir de um caso particular de uma universidade chilena, a combinação eficiente da meta-heurística permite a extrapolação da metodologia para qualquer contexto e realidade acadêmica. No entanto, devem ser consideradas as condições e necessidades de cada instituição.]]></p></abstract>
<abstract abstract-type="short" xml:lang="es"><p><![CDATA[Resumen  Introducción:  La deserción universitaria se considera actualmente como un fenómeno complejo que va más allá del número de estudiantes no matriculados, y que viene en continuo crecimiento sobre todo en los primeros años de estudio.  Objetivo:  En el presente estudio se propone un modelo de predicción que combina el Análisis de Supervivencia, Árboles de Decisión y Random Forest, bajo la filosofía de Machine Learning, para el diagnóstico temprano de los posibles factores de la deserción en estudiantes universitarios.  Métodos:  La propuesta consta de 3 fases: el Análisis de Supervivencia que permite estimar la probabilidad de permanencia del alumno (supervivencia). La fase 2 parte del valor de probabilidad obtenido en la fase anterior y lo utiliza como variable respuesta en el proceso de modelado basado en los árboles de decisión para establecer patrones de supervivencia en torno a las variables consideradas. Finalmente, en la fase 3 se identifican las variables más importantes en el modelo, utilizando Random Forest.  Resultados:  La metodología propuesta permitió diseñar un modelo de predicción, que identifica las principales variables de segmentación en patrones de comportamiento de posibles casos de deserción académica.  Conclusión:  Si bien la propuesta fue desarrollada considerando un caso particular de una universidad chilena, la eficiente combinación de la metaheurística permite la extrapolación de la metodología a cualquier contexto y realidad académica. Sin embargo, se deben considerar las condiciones y necesidades de cada institución.]]></p></abstract>
<kwd-group>
<kwd lng="en"><![CDATA[case study]]></kwd>
<kwd lng="en"><![CDATA[dynamic modeling]]></kwd>
<kwd lng="en"><![CDATA[educational data mining]]></kwd>
<kwd lng="en"><![CDATA[metaheuristics]]></kwd>
<kwd lng="pt"><![CDATA[estudo de caso]]></kwd>
<kwd lng="pt"><![CDATA[modelo dinâmico]]></kwd>
<kwd lng="pt"><![CDATA[mineração de dados educacionais]]></kwd>
<kwd lng="pt"><![CDATA[metaheurística]]></kwd>
<kwd lng="es"><![CDATA[estudio de caso]]></kwd>
<kwd lng="es"><![CDATA[modelo dinámico]]></kwd>
<kwd lng="es"><![CDATA[minería de datos educativos]]></kwd>
<kwd lng="es"><![CDATA[metaheurística]]></kwd>
</kwd-group>
</article-meta>
</front><back>
<ref-list>
<ref id="B1">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Acevedo]]></surname>
<given-names><![CDATA[F.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Concepts and measurement of dropout in higher education: A critical perspective from Latin America]]></article-title>
<source><![CDATA[Issues in Educational Research]]></source>
<year>2021</year>
<volume>31</volume>
<page-range>661-78</page-range></nlm-citation>
</ref>
<ref id="B2">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Agrusti]]></surname>
<given-names><![CDATA[F.]]></given-names>
</name>
<name>
<surname><![CDATA[Bonavolontà]]></surname>
<given-names><![CDATA[G.]]></given-names>
</name>
<name>
<surname><![CDATA[Mezzini]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[University dropout prediction through educational data mining techniques: A systematic review]]></article-title>
<source><![CDATA[Journal of E-Learning and Knowledge Society]]></source>
<year>2019</year>
<volume>15</volume>
<page-range>161-82</page-range></nlm-citation>
</ref>
<ref id="B3">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Bramer]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
</person-group>
<source><![CDATA[Principles of Data Mining]]></source>
<year>2016</year>
<publisher-loc><![CDATA[London ]]></publisher-loc>
<publisher-name><![CDATA[Springer]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B4">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Breiman]]></surname>
<given-names><![CDATA[L.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Random Forests]]></article-title>
<source><![CDATA[Machine Learning]]></source>
<year>2001</year>
<volume>45</volume>
<page-range>5-32</page-range></nlm-citation>
</ref>
<ref id="B5">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Breiman]]></surname>
<given-names><![CDATA[L.]]></given-names>
</name>
<name>
<surname><![CDATA[Friedman]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Olshen]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
<name>
<surname><![CDATA[Stone]]></surname>
<given-names><![CDATA[C.]]></given-names>
</name>
</person-group>
<source><![CDATA[Classification and Regression Trees]]></source>
<year>1984</year>
<publisher-loc><![CDATA[Monterey, California, U.S.A ]]></publisher-loc>
<publisher-name><![CDATA[Wadsworth, Inc]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B6">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Dekker]]></surname>
<given-names><![CDATA[G. W.]]></given-names>
</name>
<name>
<surname><![CDATA[Pechenizkiy]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Vleeshouwers]]></surname>
<given-names><![CDATA[J. M.]]></given-names>
</name>
</person-group>
<source><![CDATA[Predicting students drop out: A case study]]></source>
<year>2009</year>
<conf-name><![CDATA[EDM'09 - Educational Data Mining 2009: 2nd International Conference on Educational Data Mining]]></conf-name>
<conf-loc> </conf-loc>
<page-range>41-50</page-range></nlm-citation>
</ref>
<ref id="B7">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Feng]]></surname>
<given-names><![CDATA[G.]]></given-names>
</name>
<name>
<surname><![CDATA[Fan]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Chen]]></surname>
<given-names><![CDATA[Y.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Analysis and Prediction of Students' Academic Performance Based on Educational Data Mining]]></article-title>
<source><![CDATA[IEEE Access]]></source>
<year>2022</year>
<volume>10</volume>
<page-range>19558-71</page-range></nlm-citation>
</ref>
<ref id="B8">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[González]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Galvis]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
<name>
<surname><![CDATA[Hurtado]]></surname>
<given-names><![CDATA[L.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[La distribución Beta Generalizada como un modelo de sobrevivencia para analizar la evasión universitaria]]></article-title>
<source><![CDATA[Estudios pedagógicos]]></source>
<year>2014</year>
<volume>40</volume>
<page-range>133-44</page-range></nlm-citation>
</ref>
<ref id="B9">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Hastie]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
<name>
<surname><![CDATA[Tibshirani]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
<name>
<surname><![CDATA[Friedman]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
</person-group>
<source><![CDATA[The Elements of Statistical Learning: Data Mining, Inference, and Prediction]]></source>
<year>2009</year>
<publisher-name><![CDATA[Springer]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B10">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Iam-On]]></surname>
<given-names><![CDATA[N.]]></given-names>
</name>
<name>
<surname><![CDATA[Boongoen]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Improved student dropout prediction in Thai University using an ensemble of mixed-type data clusterings]]></article-title>
<source><![CDATA[International Journal of Machine Learning and Cybernetics]]></source>
<year>2017</year>
<volume>8</volume>
<page-range>497-510</page-range></nlm-citation>
</ref>
<ref id="B11">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Kleinbaum]]></surname>
<given-names><![CDATA[D. G.]]></given-names>
</name>
<name>
<surname><![CDATA[Klein]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
</person-group>
<source><![CDATA[Survival Analysis: A Self-Learning Text]]></source>
<year>2012</year>
<publisher-name><![CDATA[Springer]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B12">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Kubat]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
</person-group>
<source><![CDATA[An Introduction to Machine Learning]]></source>
<year>2017</year>
<publisher-loc><![CDATA[Cham, Switzerland ]]></publisher-loc>
<publisher-name><![CDATA[Springer International Publishing]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B13">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Miranda]]></surname>
<given-names><![CDATA[M. A.]]></given-names>
</name>
<name>
<surname><![CDATA[Guzmán]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Análisis de la deserción de estudiantes universitarios usando técnicas de minería de datos]]></article-title>
<source><![CDATA[Formación Universitaria]]></source>
<year>2017</year>
<volume>10</volume>
<page-range>61-8</page-range></nlm-citation>
</ref>
<ref id="B14">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Munizaga]]></surname>
<given-names><![CDATA[F.]]></given-names>
</name>
<name>
<surname><![CDATA[Cifuentes]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Beltrán]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Retención y abandono estudiantil en la Educación Superior Universitaria en América Latina y el Caribe: una revisión sistemática]]></article-title>
<source><![CDATA[Archivos Analíticos de Políticas Educativas]]></source>
<year>2018</year>
<volume>26</volume>
<page-range>1-31</page-range></nlm-citation>
</ref>
<ref id="B15">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Pérez-Gutiérrez]]></surname>
<given-names><![CDATA[B. R.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Comparación de técnicas de minería de datos para identificar indicios de deserción estudiantil, a partir del desempeño académico]]></article-title>
<source><![CDATA[Revista UIS Ingenierías]]></source>
<year>2020</year>
<volume>19</volume>
<page-range>193-204</page-range></nlm-citation>
</ref>
<ref id="B16">
<nlm-citation citation-type="">
<collab>R Core Team</collab>
<source><![CDATA[R: A language and environment for statistical computing. R Foundation for Statistical Computing]]></source>
<year>2018</year>
</nlm-citation>
</ref>
<ref id="B17">
<nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Segal]]></surname>
<given-names><![CDATA[M. R.]]></given-names>
</name>
</person-group>
<source><![CDATA[Machine Learning Benchmarks and Random Forest Regression, UCSF: Center for Bioinformatics and Molecular Biostatistics]]></source>
<year>2004</year>
</nlm-citation>
</ref>
<ref id="B18">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Siroky]]></surname>
<given-names><![CDATA[D. S.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Navigating random forests and related advances in algorithmic modeling]]></article-title>
<source><![CDATA[Statistics Surveys]]></source>
<year>2009</year>
<volume>3</volume>
<page-range>147-63</page-range></nlm-citation>
</ref>
<ref id="B19">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Torrado Fonseca]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Figuera Gazo]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Estudio longitudinal del proceso de abandono y reingreso de estudiantes de Ciencias Sociales. El caso de Administración y Dirección de Empresas]]></article-title>
<source><![CDATA[Educar]]></source>
<year>2019</year>
<volume>55</volume>
<page-range>401-17</page-range></nlm-citation>
</ref>
<ref id="B20">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Villa-Murillo]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Optimización del diseño de parámetros métodos Forest-Genetic]]></article-title>
<source><![CDATA[Universitat Politècnica de València]]></source>
<year>2012</year>
</nlm-citation>
</ref>
<ref id="B21">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Yamao]]></surname>
<given-names><![CDATA[E.]]></given-names>
</name>
<name>
<surname><![CDATA[Saavedra]]></surname>
<given-names><![CDATA[L. C.]]></given-names>
</name>
<name>
<surname><![CDATA[Campos Pérez]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
<name>
<surname><![CDATA[De Jesús]]></surname>
<given-names><![CDATA[V.]]></given-names>
</name>
<name>
<surname><![CDATA[Hurtado]]></surname>
<given-names><![CDATA[H.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Prediction of academic performance using data mining in first year students of Peruvian university]]></article-title>
<source><![CDATA[Revista Campus]]></source>
<year>2018</year>
<volume>23</volume>
<page-range>151-60</page-range></nlm-citation>
</ref>
</ref-list>
</back>
</article>
