<?xml version="1.0" encoding="UTF-8"?><mets:mets xmlns:mads="http://www.loc.gov/mads/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:tef="http://www.abes.fr/abes/documents/tef" xmlns:metsRights="http://cosimo.stanford.edu/sdr/metsrights/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mets="http://www.loc.gov/METS/">
    <mets:metsHdr ID="rennes1-ori-wf-1-16780" CREATEDATE="2022-06-28T10:42:46" LASTMODDATE="2022-06-28T10:42:46">
  <mets:agent ROLE="CREATOR">
            <mets:name>Université de Rennes 1</mets:name>
        </mets:agent>
</mets:metsHdr>
    <mets:dmdSec ID="desc_expr" CREATED="2022-06-28T10:42:46">
  <mets:mdWrap MDTYPE="OTHER" OTHERMDTYPE="tef_desc_these">
            <mets:xmlData>
                <tef:thesisRecord>
     <dc:title xml:lang="fr">Caractérisation de registres de langue par extraction de motifs séquentiels émergents</dc:title>
     <dcterms:alternative xml:lang="en">Characterisation of language registers using emerging sequential pattern extraction</dcterms:alternative>
     <dc:subject xml:lang="fr">registres de langues</dc:subject><dc:subject xml:lang="fr">traitement automatique des langues</dc:subject><dc:subject xml:lang="fr">motifs séquentiels</dc:subject>
     <dc:subject xml:lang="en">Language registers</dc:subject><dc:subject xml:lang="en">natural language processing</dc:subject><dc:subject xml:lang="en">sequential patterns</dc:subject><tef:sujetRameau><tef:vedetteRameauNomCommun>
						<tef:elementdEntree autoriteSource="Sudoc" autoriteExterne="027326489">Linguistique -- Informatique</tef:elementdEntree>
					</tef:vedetteRameauNomCommun><tef:vedetteRameauNomCommun>
						<tef:elementdEntree autoriteSource="Sudoc" autoriteExterne="033280649">Niveaux de langue</tef:elementdEntree>
					</tef:vedetteRameauNomCommun></tef:sujetRameau>
     <dcterms:abstract xml:lang="fr">Cette thèse s'intéresse à la caractérisation automatique des registres de langue. Sur le plan linguistique, notre contribution est d'étudier les apports des techniques de traitement automatique des langues pour extraire de nouvelles connaissances à propos des registres familier, courant et soutenu. Sur le plan informatique, nous avons proposé une méthode suffisamment générique et non supervisée pour caractériser tout type de variation linguistique, les registres s'apparentant alors à un cas d'usage. Dans le manuscrit, nous dressons tout d'abord un état des lieux des multiples définitions présentes dans la littérature, par rapport auquel nous positionnons nos travaux. Nous présentons alors la constitution linguistiquement motivée d'un large corpus de tweets en français annotés en registres. Les annotations résultent d'un procédé semi-supervisé fondé sur une graine annotée manuellement en registres et un classifieur qui généralise les annotations à l'ensemble des tweets. À partir de ce corpus annoté, nous montrons ensuite que l'emploi de techniques d'extraction de motifs séquentiels émergents permet d'extraire des traits linguistiques caractéristiques des registres étudiés. Enfin, nous détaillons notre approche pour réduire le nombre de motifs extraits en vue d'une meilleure interprétabilité des caractérisations produites.</dcterms:abstract>
     <dcterms:abstract xml:lang="en">This PhD thesis addresses the automatic characterisation of language registers. From a linguistic point of view, our contribution is to study how natural language processing techniques can help extract new knowledge about the casual, neutral, and formal registers. On the computational side, we propose a sufficiently generic, unsupervised method to characterise any type of linguistic variation, with registers serving as one use case. The manuscript first surveys the many definitions found in the literature, against which we position our work. We then present the linguistically motivated construction of a large corpus of French tweets annotated with registers. The annotations result from a semi-supervised process based on a manually annotated seed and a classifier that generalises the annotations to all the tweets. Based on this annotated corpus, we then show that emerging sequential pattern extraction techniques make it possible to extract linguistic features characteristic of the registers under study. Finally, we detail our approach for reducing the number of extracted patterns in order to improve the interpretability of the resulting characterisations.</dcterms:abstract>
     <dc:type>Electronic Thesis or Dissertation</dc:type><dc:type xsi:type="dcterms:DCMIType">Text</dc:type>
     <dc:language xsi:type="dcterms:RFC3066">fr</dc:language>
    </tef:thesisRecord>
            </mets:xmlData>
        </mets:mdWrap>
</mets:dmdSec>
    <mets:dmdSec ID="desc_edition" CREATED="2022-06-28T10:42:46">
  <mets:mdWrap MDTYPE="OTHER" OTHERMDTYPE="tef_desc_edition">
            <mets:xmlData>
                <tef:edition><dcterms:medium xsi:type="dcterms:IMT">application/pdf</dcterms:medium><dcterms:extent>1 : 7784 Ko</dcterms:extent><dc:identifier xsi:type="dcterms:URI">https://ged.univ-rennes1.fr/nuxeo/site/esupversions/cec2a53d-cccf-4adb-bb0c-5e6269e30f1e</dc:identifier></tef:edition>
            </mets:xmlData>
        </mets:mdWrap>
</mets:dmdSec>
    <mets:amdSec>
        <mets:techMD ID="admin_expr" CREATED="">
            <mets:mdWrap MDTYPE="OTHER" OTHERMDTYPE="tef_admin_these">
                <mets:xmlData>
                    <tef:thesisAdmin>
                        <tef:auteur>
       <tef:nom>Mekki</tef:nom>
       <tef:prenom>Jade</tef:prenom>
       
       <tef:dateNaissance>1993-12-09</tef:dateNaissance>
       <tef:nationalite scheme="ISO-3166-1">FR</tef:nationalite>
       <tef:autoriteExterne autoriteSource="Sudoc">269295178</tef:autoriteExterne>
       <tef:autoriteExterne autoriteSource="mailPerso">jade.mekki@gmail.com</tef:autoriteExterne>
      </tef:auteur>
                        <dc:identifier xsi:type="tef:NNT">2022REN1S098</dc:identifier>
                        <dc:identifier xsi:type="tef:nationalThesisPID">http://www.theses.fr/2022REN1S098</dc:identifier>
                        <dcterms:dateAccepted xsi:type="dcterms:W3CDTF">2022-09-08</dcterms:dateAccepted>
                        <tef:thesis.degree>
                            <tef:thesis.degree.discipline xml:lang="fr">Informatique</tef:thesis.degree.discipline>
                            <tef:thesis.degree.grantor>
        <tef:nom>Université de Rennes 1</tef:nom><tef:autoriteInterne>thesis.degree.grantor_1</tef:autoriteInterne>
        
        <tef:autoriteExterne autoriteSource="Sudoc">02778715X</tef:autoriteExterne>
       </tef:thesis.degree.grantor>
                            <tef:thesis.degree.level>Doctorat</tef:thesis.degree.level>
                        </tef:thesis.degree>
                        <tef:theseSurTravaux>non</tef:theseSurTravaux>
                        <tef:avisJury>oui</tef:avisJury><tef:directeurThese><tef:nom>Lolive</tef:nom><tef:prenom>Damien</tef:prenom><tef:autoriteInterne>intervenant_1</tef:autoriteInterne><tef:autoriteExterne autoriteSource="Sudoc">13017498X</tef:autoriteExterne></tef:directeurThese><tef:directeurThese><tef:nom>Lecorvé</tef:nom><tef:prenom>Gwénolé</tef:prenom><tef:autoriteInterne>intervenant_2</tef:autoriteInterne><tef:autoriteExterne autoriteSource="Sudoc">150245254</tef:autoriteExterne></tef:directeurThese><tef:directeurThese><tef:nom>Battistelli</tef:nom><tef:prenom>Delphine</tef:prenom><tef:autoriteInterne>intervenant_6</tef:autoriteInterne><tef:autoriteExterne autoriteSource="Sudoc">060895217</tef:autoriteExterne></tef:directeurThese><tef:presidentJury><tef:nom>Antoine</tef:nom><tef:prenom>Jean-Yves</tef:prenom><tef:autoriteInterne>intervenant_3</tef:autoriteInterne><tef:autoriteExterne autoriteSource="Sudoc">137158319</tef:autoriteExterne></tef:presidentJury><tef:membreJury><tef:nom>Baude</tef:nom><tef:prenom>Olivier</tef:prenom><tef:autoriteInterne>intervenant_7</tef:autoriteInterne><tef:autoriteExterne autoriteSource="Sudoc">105682640</tef:autoriteExterne></tef:membreJury><tef:membreJury><tef:nom>Legallois</tef:nom><tef:prenom>Dominique</tef:prenom><tef:autoriteInterne>intervenant_8</tef:autoriteInterne><tef:autoriteExterne autoriteSource="Sudoc">109894847</tef:autoriteExterne></tef:membreJury><tef:rapporteur><tef:nom>Benamara</tef:nom><tef:prenom>Farah</tef:prenom><tef:autoriteInterne>intervenant_4</tef:autoriteInterne><tef:autoriteExterne autoriteSource="Sudoc">084066830</tef:autoriteExterne></tef:rapporteur><tef:rapporteur><tef:nom>Charnois</tef:nom><tef:prenom>Thierry</tef:prenom><tef:autoriteInterne>intervenant_5</tef:autoriteInterne><tef:autoriteExterne autoriteSource="Sudoc">168705117</tef:autoriteExterne></tef:rapporteur>
      
      
                        
                        
                        <tef:ecoleDoctorale>
       <tef:nom>MATHSTIC</tef:nom><tef:autoriteInterne>ecoleDoctorale_1</tef:autoriteInterne>
       
       <tef:autoriteExterne autoriteSource="Sudoc">204770424</tef:autoriteExterne>
      </tef:ecoleDoctorale><tef:partenaireRecherche type="laboratoire">
							<tef:nom>IRISA</tef:nom><tef:autoriteInterne>partenaireRecherche_1</tef:autoriteInterne>
							<tef:autoriteExterne autoriteSource="Sudoc">026386909</tef:autoriteExterne>
						</tef:partenaireRecherche>
                        <tef:oaiSetSpec>ddc:004</tef:oaiSetSpec>
                        
                        
                    <tef:MADSAuthority authorityID="intervenant_1" type="personal"><tef:personMADS><mads:namePart type="family">Lolive</mads:namePart><mads:namePart type="given">Damien</mads:namePart></tef:personMADS></tef:MADSAuthority><tef:MADSAuthority authorityID="intervenant_2" type="personal"><tef:personMADS><mads:namePart type="family">Lecorvé</mads:namePart><mads:namePart type="given">Gwénolé</mads:namePart></tef:personMADS></tef:MADSAuthority><tef:MADSAuthority authorityID="intervenant_3" type="personal"><tef:personMADS><mads:namePart type="family">Antoine</mads:namePart><mads:namePart type="given">Jean-Yves</mads:namePart></tef:personMADS></tef:MADSAuthority><tef:MADSAuthority authorityID="intervenant_4" type="personal"><tef:personMADS><mads:namePart type="family">Benamara</mads:namePart><mads:namePart type="given">Farah</mads:namePart></tef:personMADS></tef:MADSAuthority><tef:MADSAuthority authorityID="intervenant_5" type="personal"><tef:personMADS><mads:namePart type="family">Charnois</mads:namePart><mads:namePart type="given">Thierry</mads:namePart></tef:personMADS></tef:MADSAuthority><tef:MADSAuthority authorityID="intervenant_6" type="personal"><tef:personMADS><mads:namePart type="family">Battistelli</mads:namePart><mads:namePart type="given">Delphine</mads:namePart></tef:personMADS></tef:MADSAuthority><tef:MADSAuthority authorityID="intervenant_7" type="personal"><tef:personMADS><mads:namePart type="family">Baude</mads:namePart><mads:namePart type="given">Olivier</mads:namePart></tef:personMADS></tef:MADSAuthority><tef:MADSAuthority authorityID="intervenant_8" type="personal"><tef:personMADS><mads:namePart type="family">Legallois</mads:namePart><mads:namePart type="given">Dominique</mads:namePart></tef:personMADS></tef:MADSAuthority><tef:MADSAuthority authorityID="thesis.degree.grantor_1" type="corporate"><tef:personMADS><mads:namePart>Université de Rennes 1</mads:namePart><mads:description>Sciences et technologie, médecine, pharmacie, odontologie, droit, économie, gestion, philosophie</mads:description></tef:personMADS></tef:MADSAuthority><tef:MADSAuthority authorityID="ecoleDoctorale_1" type="corporate"><tef:personMADS><mads:namePart>MATHSTIC</mads:namePart><mads:description>École doctorale Mathématiques et sciences et technologies de l'information et de la communication (Rennes)</mads:description></tef:personMADS></tef:MADSAuthority><tef:MADSAuthority authorityID="partenaireRecherche_1" type="corporate"><tef:personMADS><mads:namePart>IRISA</mads:namePart></tef:personMADS></tef:MADSAuthority></tef:thesisAdmin>
                </mets:xmlData>
            </mets:mdWrap>
        </mets:techMD><mets:techMD ID="file_1"><mets:mdWrap MDTYPE="OTHER" OTHERMDTYPE="tef_tech_fichier"><mets:xmlData><tef:meta_fichier>
     <tef:encodage>ASCII</tef:encodage>
     <tef:formatFichier>PDF</tef:formatFichier>
     
     
     
     <tef:taille>7970363</tef:taille>
    </tef:meta_fichier></mets:xmlData></mets:mdWrap></mets:techMD>
        
        <mets:rightsMD ID="dr_expr_thesard" CREATED="">
            <mets:mdWrap MDTYPE="OTHER" OTHERMDTYPE="tef_droits_auteur_these">
                <mets:xmlData>
                    <metsRights:RightsDeclarationMD>
                        <metsRights:Context CONTEXTCLASS="GENERAL PUBLIC">
                            <metsRights:Permissions DISCOVER="true" DISPLAY="true" COPY="true" DUPLICATE="true" MODIFY="false" DELETE="false" PRINT="true"/>
                        </metsRights:Context>
                    </metsRights:RightsDeclarationMD>
                </mets:xmlData>
            </mets:mdWrap>
        </mets:rightsMD>
        <mets:rightsMD ID="dr_expr_univ" CREATED="">
            <mets:mdWrap MDTYPE="OTHER" OTHERMDTYPE="tef_droits_etablissement_these">
                <mets:xmlData>
                    <metsRights:RightsDeclarationMD>
                        <metsRights:Context CONTEXTCLASS="GENERAL PUBLIC">
                            <metsRights:Permissions DISCOVER="true" DISPLAY="true" COPY="true" DUPLICATE="true" MODIFY="false" DELETE="false" PRINT="true"/>
                        </metsRights:Context>
                    </metsRights:RightsDeclarationMD>
                </mets:xmlData>
            </mets:mdWrap>
        </mets:rightsMD>
        <mets:rightsMD ID="dr_version" CREATED="">
            <mets:mdWrap MDTYPE="OTHER" OTHERMDTYPE="tef_droits_version">
                <mets:xmlData>
                    <metsRights:RightsDeclarationMD>
                        <metsRights:Context CONTEXTCLASS="GENERAL PUBLIC">
                            <metsRights:Permissions DISCOVER="true" DISPLAY="true" COPY="true" DUPLICATE="true" MODIFY="false" DELETE="false" PRINT="true"/>
                        </metsRights:Context>
                    </metsRights:RightsDeclarationMD>
                </mets:xmlData>
            </mets:mdWrap>
        </mets:rightsMD>
    </mets:amdSec>
    <mets:fileSec>
  <mets:fileGrp ID="FGrID1" USE="archive"><mets:file ID="FID1" ADMID="file_1" MIMETYPE="application/pdf" USE="maitre"><mets:FLocat LOCTYPE="URL" xlink:href="https://ged.univ-rennes1.fr/nuxeo/site/esupversions/cec2a53d-cccf-4adb-bb0c-5e6269e30f1e"/></mets:file></mets:fileGrp>
 </mets:fileSec>
    <mets:structMap TYPE="logical">
        <mets:div DMDID="desc_expr" ADMID="dr_expr_thesard dr_expr_univ admin_expr" TYPE="THESE" CONTENTIDS="http://ori-oai-search.univ-rennes1.fr/uid/rennes1-ori-wf-1-16780/oeuvre">
            <mets:div ADMID="dr_version" TYPE="VERSION_COMPLETE" CONTENTIDS="http://ori-oai-search.univ-rennes1.fr/uid/rennes1-ori-wf-1-16780/oeuvre/version">
                <mets:div DMDID="desc_edition" TYPE="EDITION" CONTENTIDS="http://ori-oai-search.univ-rennes1.fr/uid/rennes1-ori-wf-1-16780/oeuvre/version/edition">
                    <mets:fptr FILEID="FGrID1"/>
                </mets:div>
            </mets:div>
        </mets:div>
    </mets:structMap>
</mets:mets>