"Jd,ddlmZmZmZddlmZddlmZm Z ddl m Z ddl m Z ddl mZddl mZmZdd l mZmZmZdd l mZmZdd l mZdd lmZdd lmZeeZe dkreZne ZGddeZdS))absolute_importdivisionunicode_literals)unichr)deque OrderedDict) version_info)spaceCharacters)entities) asciiLettersasciiUpper2Lower)digits hexDigitsEOF) tokenTypes tagTokenTypes)replacementCharacters)HTMLInputStream)Trie)ceZdZdZdMfd ZdZdZdNdZdZd Z d Z d Z d Z d Z dZdZdZdZdZdZdZdZdZdZdZdZdZdZdZdZdZdZd Z d!Z!d"Z"d#Z#d$Z$d%Z%d&Z&d'Z'd(Z(d)Z)d*Z*d+Z+d,Z,d-Z-d.Z.d/Z/d0Z0d1Z1d2Z2d3Z3d4Z4d5Z5d6Z6d7Z7d8Z8d9Z9d:Z:d;Z;d<ZZ>d?Z?d@Z@dAZAdBZBdCZCdDZDdEZEdFZFdGZGdHZHdIZIdJZJdKZKdLZLxZMS)O HTMLTokenizera  This class takes care of tokenizing HTML. * self.currentToken Holds the token that is currently being processed. * self.state Holds a reference to the method to be invoked... XXX * self.stream Points to HTMLInputStream object. Nc t|fi||_||_d|_g|_|j|_d|_d|_tt| dSNF) rstreamparser escapeFlag lastFourChars dataStatestateescape currentTokensuperr__init__)selfrrkwargs __class__s R/opt/alt/python311/lib/python3.11/site-packages/pip/_vendor/html5lib/_tokenizer.pyr&zHTMLTokenizer.__init__(sn%f7777   ^  ! mT""++-----c#fKtg|_|r|jjr;t d|jjddV|jj;|jr"|jV|j"|dSdS)z This is where the magic happens. We do our usually processing through the states and when we have a token to return we yield the token which pauses processing until the next token is requested. ParseErrorrtypedataN)r tokenQueuer"rerrorsrpoppopleftr's r*__iter__zHTMLTokenizer.__iter__7s ))jjll 0+$ \),7ASAWAWXYAZAZ[[[[[+$ \/ 0o--////// 0jjll 0 0 0 0 0r+clt}d}|r t}d}g}|j}||vrD|tur;|||j}||vr |tu;t d||}|tvr:t|}|j tddd|idnd|cxkrd ksn|d kr.d }|j tddd|idnd |cxkrd ksBnd|cxkrdks3nd|cxkrdks$nd|cxkrdksn|tgdvr+|j tddd|id t|}n@#t$r3|dz }td|dz ztd|dzzz}YnwxYw|dkrB|j tddd|j||S)zThis function returns either U+FFFD or the character based on the decimal or hexadecimal representation. It also discards ";" if present. If not present self.tokenQueue.append({"type": tokenTypes["ParseError"]}) is invoked. r-z$illegal-codepoint-for-numeric-entity charAsIntr/r0datavarsii�r ii)# iiiiiiiiiiiiiiiiiii i i i i i i i i i iiiiir>iii;z numeric-entity-without-semicolonr.)rrrcharrappendintjoinrr1r frozensetchr ValueErrorunget) r'isHexallowedradix charStackcr;rGvs r*consumeNumberEntityz!HTMLTokenizer.consumeNumberEntityGs   GE  K    7llq||   Q      ""A7llq||  **E22  - - -(3D O " "J|,D$J1S((O**J|4L,L,N,NOOOrNc))m)|, <<|,66|,33K%%immoo666 2779#5#55FF%j1FK%%immoo666bggi &>???FF&& <0H(?(A(ABBB !!)--//222rwwy111  T  f %b )! , , , 6 , , , , ,((- ( O " "Jy,A6#R#R S S S S Ss!AI&& I54I5c4||ddS)zIThis method replaces the need for "entityInAttributeValueState". T)rerfN)rl)r'res r*processEntityInAttributez&HTMLTokenizer.processEntityInAttributes# {$GGGGGr+c|j}|dtvr |dt|d<|dtdkrZ|d}t |}t |t |kr||ddd||d<|dtdkr`|dr(|j tdd d |d r(|j tdd d |j ||j |_ dS) zThis method is a generic handler for emitting the tags. It also sets the state to "data" because that's what's needed after a token has been emitted. r/nameStartTagr0NrZEndTagr-zattributes-in-end-tagr. selfClosingzself-closing-flag-on-end-tag) r$r translaterr attributeMaprcupdater1rHr!r")r'tokenrawr0s r*emitCurrentTokenzHTMLTokenizer.emitCurrentTokensU ! &M] * *!&M334DEEE&MV} : 666Fm#C((s88c$ii''KKDDbD *** $f V} 8 444=NO**J|4L4K,M,MNNN'UO**J|4L4R,T,TUUU u%%%^ r+cz|j}|dkr|j|_n |dkr |j|_n|dkrQ|jtddd|jtdddn|turdS|tvrJ|jtd ||j td zdnE|j d }|jtd||zdd S) NrWrXr-invalid-codepointr.r_Fr^TrWrXr{) rrGentityDataStater" tagOpenStater1rHrrr charsUntilr'r0charss r*r!zHTMLTokenizer.dataStatesl{!! 3;;-DJJ S[[*DJJ X   O " "J|,D,?$A$A B B B O " "J|,D,4$6$6 7 7 7 7 S[[5 _ $ $ O " "J7H,I$(4;+A+A/SW+X+X$X$Z$Z [ [ [ [ K**+?@@E O " "J|,D$(5L$2$2 3 3 3tr+cF||j|_dSNT)rlr!r"r5s r*r~zHTMLTokenizer.entityDataStates" ^ tr+c~|j}|dkr|j|_n|dkr |j|_n|t krdS|dkrQ|jtddd|jtdd dn|tvrJ|jtd ||j td zdnE|j d }|jtd||zdd S) NrWrXFr{r-r|r.r_r?r^Tr}) rrGcharacterReferenceInRcdatar"rcdataLessThanSignStaterr1rHrr rrs r* rcdataStatezHTMLTokenizer.rcdataState"sl{!! 3;;8DJJ S[[5DJJ S[[5 X   O " "J|,D,?$A$A B B B O " "J|,D,4$6$6 7 7 7 7 _ $ $ O " "J7H,I$(4;+A+A/SW+X+X$X$Z$Z [ [ [ [ K**+?@@E O " "J|,D$(5L$2$2 3 3 3tr+cF||j|_dSr)rlrr"r5s r*rz(HTMLTokenizer.characterReferenceInRcdata?s# % tr+c|j}|dkr |j|_n|dkrQ|jt ddd|jt dddnR|tkrdS|jd }|jt d||zdd S NrXr{r-r|r.r_r?F)rXr{T) rrGrawtextLessThanSignStater"r1rHrrrrs r* rawtextStatezHTMLTokenizer.rawtextStateDs{!! 3;;6DJJ X   O " "J|,D,?$A$A B B B O " "J|,D,4$6$6 7 7 7 7 S[[5K**?;;E O " "J|,D$(5L$2$2 3 3 3tr+c|j}|dkr |j|_n|dkrQ|jt ddd|jt dddnR|tkrdS|jd }|jt d||zdd Sr) rrGscriptDataLessThanSignStater"r1rHrrrrs r*scriptDataStatezHTMLTokenizer.scriptDataStateVs{!! 3;;9DJJ X   O " "J|,D,?$A$A B B B O " "J|,D,4$6$6 7 7 7 7 S[[5K**?;;E O " "J|,D$(5L$2$2 3 3 3tr+c|j}|tkrdS|dkrQ|jt ddd|jt dddnC|jt d||jdzddS) NFr{r-r|r.r_r?T)rrGrr1rHrrr'r0s r*plaintextStatezHTMLTokenizer.plaintextStatehs{!! 3;;5 X   O " "J|,D,?$A$A B B B O " "J|,D,4$6$6 7 7 7 7 O " "J|,D$(4;+A+A(+K+K$K$M$M N N Ntr+cB|j}|dkr|j|_nq|dkr|j|_n]|t vr&t d|gddd|_|j|_n.|dkr]|j t ddd |j t d d d |j |_n|d krO|j t dd d |j ||j |_nv|j t ddd |j t d dd |j ||j |_dS)N!/rqF)r/rpr0rsselfClosingAcknowledged>r-z'expected-tag-name-but-got-right-bracketr.r_z<>?z'expected-tag-name-but-got-question-markzexpected-tag-namerXT)rrGmarkupDeclarationOpenStater"closeTagOpenStater rr$ tagNameStater1rHr!rNbogusCommentStaters r*rzHTMLTokenizer.tagOpenStatews{!! 3;;8DJJ S[[/DJJ \ ! !)3J)?)-r05|jtddd|j dxxd z cc<n|j dxx|z cc<d S) Nrr-zeof-in-tag-namer.rr{r|rpr?T) rrGr beforeAttributeNameStater"ryrr1rHrr!selfClosingStartTagStater$rs r*rzHTMLTokenizer.tagNameStates'{!! ? " "6DJJ S[[  ! ! # # # # S[[ O " "J|,D$5$7$7 8 8 8DJJ S[[6DJJ X   O " "J|,D,?$A$A B B B  f % % % 1 % % % %  f % % % - % % %tr+c|j}|dkrd|_|j|_nN|jtddd|j||j |_dSNrr:r_rXr.T) rrGtemporaryBufferrcdataEndTagOpenStater"r1rHrrNrrs r*rz%HTMLTokenizer.rcdataLessThanSignStatesz{!! 3;;#%D 3DJJ O " "J|,Dc#R#R S S S K  d # # #)DJtr+c |j}|tvr|xj|z c_|j|_nN|jtddd|j ||j |_dSNr_rr.T) rrGr rrcdataEndTagNameStater"r1rHrrNrrs r*rz#HTMLTokenizer.rcdataEndTagOpenStates{!! <    D ( 3DJJ O " "J|,Dd#S#S T T T K  d # # #)DJtr+c|jo9|jd|jk}|j}|t vr+|r)t d|jgdd|_|j|_n|dkr+|r)t d|jgdd|_|j |_n|dkr?|r=t d|jgdd|_| |j |_np|tvr|xj|z c_nV|j t dd|jzd |j||j|_d S NrprrFrrrr_rr.T)r$lowerrrrGr rrr"rryr!r r1rHrNrr' appropriater0s r*rz#HTMLTokenizer.rcdataEndTagNameStates'mD,=f,E,K,K,M,MQUQeQkQkQmQm,m {!! ? " "{ ")3H)=)-)=)+E!C!CD 6DJJ S[[[[)3H)=)-)=)+E!C!CD 6DJJ S[[[[)3H)=)-)=)+E!C!CD   ! ! # # #DJJ \ ! !  D ( O " "J|,D,043G,G$I$I J J J K  d # # #)DJtr+c|j}|dkrd|_|j|_nN|jtddd|j||j |_dSr) rrGrrawtextEndTagOpenStater"r1rHrrNrrs r*rz&HTMLTokenizer.rawtextLessThanSignStatesz{!! 3;;#%D 4DJJ O " "J|,Dc#R#R S S S K  d # # #*DJtr+c |j}|tvr|xj|z c_|j|_nN|jtddd|j ||j |_dSr) rrGr rrawtextEndTagNameStater"r1rHrrNrrs r*rz$HTMLTokenizer.rawtextEndTagOpenStates{!! <    D ( 4DJJ O " "J|,Dd#S#S T T T K  d # # #*DJtr+c|jo9|jd|jk}|j}|t vr+|r)t d|jgdd|_|j|_n|dkr+|r)t d|jgdd|_|j |_n|dkr?|r=t d|jgdd|_| |j |_np|tvr|xj|z c_nV|j t dd|jzd |j||j|_d Sr)r$rrrrGr rrr"rryr!r r1rHrNrrs r*rz$HTMLTokenizer.rawtextEndTagNameStates'mD,=f,E,K,K,M,MQUQeQkQkQmQm,m {!! ? " "{ ")3H)=)-)=)+E!C!CD 6DJJ S[[[[)3H)=)-)=)+E!C!CD 6DJJ S[[[[)3H)=)-)=)+E!C!CD   ! ! # # #DJJ \ ! !  D ( O " "J|,D,043G,G$I$I J J J K  d # # #*DJtr+c~|j}|dkrd|_|j|_n|dkr5|jtddd|j|_nN|jtddd|j ||j |_dS) Nrr:rr_zDJJ \ ! ! O " "J|,DcTXj#Y#Y Z Z Z#'D >DJJ O " "J|,Dc#R#R S S S K  d # # #4DJtr+c|j}|tvr||_|j|_nN|jtddd|j ||j |_dSr) rrGr r scriptDataEscapedEndTagNameStater"r1rHrrNrrs r*rz.HTMLTokenizer.scriptDataEscapedEndTagOpenStates|{!! <  #'D >DJJ O " "J|,Dd#S#S T T T K  d # # #4DJtr+c|jo9|jd|jk}|j}|t vr+|r)t d|jgdd|_|j|_n|dkr+|r)t d|jgdd|_|j |_n|dkr?|r=t d|jgdd|_| |j |_np|tvr|xj|z c_nV|j t dd|jzd |j||j|_d Sr)r$rrrrGr rrr"rryr!r r1rHrNrrs r*rz.HTMLTokenizer.scriptDataEscapedEndTagNameStates'mD,=f,E,K,K,M,MQUQeQkQkQmQm,m {!! ? " "{ ")3H)=)-)=)+E!C!CD 6DJJ S[[[[)3H)=)-)=)+E!C!CD 6DJJ S[[[[)3H)=)-)=)+E!C!CD   ! ! # # #DJJ \ ! !  D ( O " "J|,D,043G,G$I$I J J J K  d # # #4DJtr+c|j}|ttdzvr_|jt d|d|jdkr |j |_ nu|j |_ nh|tvr9|jt d|d|xj|z c_n&|j ||j |_ dSN)rrr_r.scriptT)rrGr rKr1rHrrrscriptDataDoubleEscapedStater"rr rNrs r*rz.HTMLTokenizer.scriptDataDoubleEscapeStartStates{!! Oi &;&;; < < O " "J|,Dd#S#S T T T#))++x77!> !8 \ ! ! O " "J|,Dd#S#S T T T  D ( K  d # # #4DJtr+c|j}|dkr5|jtddd|j|_n|dkr5|jtddd|j|_n|dkrQ|jtddd|jtdddnh|tkr5|jtdd d|j |_n(|jtd|dd S Nrr_r.rXr{r-r|r?eof-in-script-in-scriptT) rrGr1rHr scriptDataDoubleEscapedDashStater"(scriptDataDoubleEscapedLessThanSignStaterr!rs r*rz*HTMLTokenizer.scriptDataDoubleEscapedStatesc{!! 3;; O " "J|,Dc#R#R S S S>DJJ S[[ O " "J|,Dc#R#R S S SFDJJ X   O " "J|,D,?$A$A B B B O " "J|,D,4$6$6 7 7 7 7 S[[ O " "J|,D$=$?$? @ @ @DJJ O " "J|,Dd#S#S T T Ttr+c|j}|dkr6|jtddd|j|_n|dkr5|jtddd|j|_n|dkr]|jtddd|jtddd|j|_nt|tkr5|jtdd d|j |_n4|jtd|d|j|_d Sr) rrGr1rHr$scriptDataDoubleEscapedDashDashStater"rrrr!rs r*rz.HTMLTokenizer.scriptDataDoubleEscapedDashStatest{!! 3;; O " "J|,Dc#R#R S S SBDJJ S[[ O " "J|,Dc#R#R S S SFDJJ X   O " "J|,D,?$A$A B B B O " "J|,D,4$6$6 7 7 7:DJJ S[[ O " "J|,D$=$?$? @ @ @DJJ O " "J|,Dd#S#S T T T:DJtr+c4|j}|dkr*|jtdddnN|dkr6|jtddd|j|_n|dkr5|jtddd|j|_n|dkr]|jtddd|jtdd d|j|_nt|tkr5|jtdd d|j |_n4|jtd|d|j|_d S) Nrr_r.rXrr{r-r|r?rT) rrGr1rHrrr"rrrr!rs r*rz2HTMLTokenizer.scriptDataDoubleEscapedDashDashState%s{!! 3;; O " "J|,Dc#R#R S S S S S[[ O " "J|,Dc#R#R S S SFDJJ S[[ O " "J|,Dc#R#R S S S-DJJ X   O " "J|,D,?$A$A B B B O " "J|,D,4$6$6 7 7 7:DJJ S[[ O " "J|,D$=$?$? @ @ @DJJ O " "J|,Dd#S#S T T T:DJtr+c|j}|dkr<|jtdddd|_|j|_n&|j||j |_dS)Nrr_r.r:T) rrGr1rHrrscriptDataDoubleEscapeEndStater"rNrrs r*rz6HTMLTokenizer.scriptDataDoubleEscapedLessThanSignState>sz{!! 3;; O " "J|,Dc#R#R S S S#%D  \ ! ! O " "J|,Dd#S#S T T T  D ( K  d # # #:DJtr+c|j}|tvr"|jtdn|tvr0|jd|dg|j|_nT|dkr| n8|dkr|j |_n$|dvrW|j tddd |jd|dg|j|_n|d krW|j tdd d |jdd dg|j|_nl|tur5|j tdd d |j|_n.|jd|dg|j|_dS)NTr0r:rr)'"r]rXr-#invalid-character-in-attribute-namer.r{r|r?z#expected-attribute-name-but-got-eof)rrGr rr r$rHattributeNameStater"ryrr1rrr!rs r*rz&HTMLTokenizer.beforeAttributeNameStateYs{!! ? " " K " "?D 9 9 9 9 \ ! !  f % , ,dBZ 8 8 80DJJ S[[  ! ! # # # # S[[6DJJ ) ) ) O " "J|,D$I$K$K L L L  f % , ,dBZ 8 8 80DJJ X   O " "J|,D,?$A$A B B B  f % , ,h^ < < <0DJJ S[[ O " "J|,D$I$K$K L L LDJJ  f % , ,dBZ 8 8 80DJtr+c|j}d}d}|dkr|j|_n|tvrF|jdddxx||jtdzz cc<d}n8|dkrd}n.|tvr|j|_n|dkr|j |_n|d krL|j td d d |jdddxxd z cc<d}n|dvrL|j td dd |jdddxx|z cc<d}na|tur5|j td dd |j|_n#|jdddxx|z cc<d}|r|jdddt |jddd<|jdddD]L\}}|jddd|kr*|j td dd nM|r|dS)NTFr]r0rZrrrr{r-r|r.r?rrrXrzeof-in-attribute-namezduplicate-attribute)rrGbeforeAttributeValueStater"r r$rr afterAttributeNameStaterr1rHrrr!rtrry)r'r0leavingThisState emitTokenrp_s r*rz HTMLTokenizer.attributeNameStatews{!! 3;;7DJJ \ ! !  f %b )! , , , &&|T::1; ; , , ,$   S[[II _ $ $5DJJ S[[6DJJ X   O " "J|,D,?$A$A B B B  f %b )! , , , 8 , , ,$   _ $ $ O " "J|,D$I$K$K L L L  f %b )! , , , 4 , , ,$   S[[ O " "J|,D,C$E$E F F FDJJ  f %b )! , , , 4 , , ,$   ( !&)"-a0::;KLL  f %b )! ,,V4SbS9  a$V,R03t;;O**J|4L,A,C,CDDDE<  (%%'''tr+c|j}|tvr"|jtdn|dkr|j|_n|dkr|nq|tvr0|jd |dg|j |_n8|dkr|j |_n$|dkrW|j tdd d |jd d dg|j |_n|d vrW|j tdd d |jd |dg|j |_nl|tur5|j tddd |j|_n.|jd |dg|j |_dS)NTr]rr0r:rr{r-r|r.r?rz&invalid-character-after-attribute-namezexpected-end-of-tag-but-got-eof)rrGr rrr"ryr r$rHrrr1rrr!rs r*rz%HTMLTokenizer.afterAttributeNameStates{!! ? " " K " "?D 9 9 9 9 S[[7DJJ S[[  ! ! # # # # \ ! !  f % , ,dBZ 8 8 80DJJ S[[6DJJ X   O " "J|,D,?$A$A B B B  f % , ,h^ < < <0DJJ _ $ $ O " "J|,D$L$N$N O O O  f % , ,dBZ 8 8 80DJJ S[[ O " "J|,D$E$G$G H H HDJJ  f % , ,dBZ 8 8 80DJtr+c|j}|tvr"|jtdn|dkr|j|_n|dkr(|j|_|j|ny|dkr|j|_ne|dkr>|j tddd| n!|d krV|j tdd d|j d d d xxdz cc<|j|_n|dvrV|j tddd|j d d d xx|z cc<|j|_nk|tur5|j tddd|j|_n-|j d d d xx|z cc<|j|_dS)NTrrWrrr-z.expected-attribute-value-but-got-right-bracketr.r{r|r0rZr r?)r]rX`z"equals-in-unquoted-attribute-valuez$expected-attribute-value-but-got-eof)rrGr rattributeValueDoubleQuotedStater"attributeValueUnQuotedStaterNattributeValueSingleQuotedStater1rHrryr$rr!rs r*rz'HTMLTokenizer.beforeAttributeValueStatesG{!! ? " " K " "?D 9 9 9 9 T\\=DJJ S[[9DJ K  d # # # # S[[=DJJ S[[ O " "J|,D$T$V$V W W W  ! ! # # # # X   O " "J|,D,?$A$A B B B  f %b )! , , , 8 , , ,9DJJ _ $ $ O " "J|,D$H$J$J K K K  f %b )! , , , 4 , , ,9DJJ S[[ O " "J|,D$J$L$L M M MDJJ  f %b )! , , , 4 , , ,9DJtr+c*|j}|dkr |j|_n|dkr|dn|dkrJ|jtddd|jddd xxd z cc<nz|tur5|jtdd d|j |_n<|jddd xx||j d zz cc<d S)NrrWr{r-r|r.r0rZr r?z#eof-in-attribute-value-double-quote)rrWr{T rrGafterAttributeValueStater"rnr1rHrr$rr!rrs r*rz-HTMLTokenizer.attributeValueDoubleQuotedStatesD{!! 4<<6DJJ S[[  ) )# . . . . X   O " "J|,D,?$A$A B B B  f %b )! , , , 8 , , , , S[[ O " "J|,D$I$K$K L L LDJJ  f %b )! , , , &&'<==1> > , , ,tr+c*|j}|dkr |j|_n|dkr|dn|dkrJ|jtddd|jddd xxd z cc<nz|tur5|jtdd d|j |_n<|jddd xx||j d zz cc<d S)NrrWr{r-r|r.r0rZr r?z#eof-in-attribute-value-single-quote)rrWr{Trrs r*rz-HTMLTokenizer.attributeValueSingleQuotedStatesD{!! 3;;6DJJ S[[  ) )# . . . . X   O " "J|,D,?$A$A B B B  f %b )! , , , 8 , , , , S[[ O " "J|,D$I$K$K L L LDJJ  f %b )! , , , &&';<<1= = , , ,tr+c 2|j}|tvr|j|_nf|dkr|dnI|dkr|n-|dvrJ|jtddd|j ddd xx|z cc<n|d krJ|jtdd d|j ddd xxd z cc<n|tur5|jtdd d|j |_nQ|j ddd xx||j tdtzzz cc<dS)NrWr)rrr]rXrr-z0unexpected-character-in-unquoted-attribute-valuer.r0rZr r{r|r?z eof-in-attribute-value-no-quotes)rWrrrr]rXrr{T)rrGr rr"rnryr1rHrr$rr!rrKrs r*rz)HTMLTokenizer.attributeValueUnQuotedStates{!! ? " "6DJJ S[[  ) )# . . . . S[[  ! ! # # # # . . . O " "J|,D$V$X$X Y Y Y  f %b )! , , , 4 , , , , X   O " "J|,D,?$A$A B B B  f %b )! , , , 8 , , , , S[[ O " "J|,D$F$H$H I I IDJJ  f %b )! , , ,t{7M7MGHH?Z8\8\1\ \ , , ,tr+c |j}|tvr |j|_n|dkr|n|dkr |j|_n|turO|j tddd|j ||j |_nN|j tddd|j ||j|_dS)Nrrr-z$unexpected-EOF-after-attribute-valuer.z*unexpected-character-after-attribute-valueT) rrGr rr"ryrrr1rHrrNr!rs r*rz&HTMLTokenizer.afterAttributeValueState.s{!! ? " "6DJJ S[[  ! ! # # # # S[[6DJJ S[[ O " "J|,D$J$L$L M M M K  d # # #DJJ O " "J|,D$P$R$R S S S K  d # # #6DJtr+c|j}|dkrd|jd<|n|turO|jtddd|j||j |_ nN|jtddd|j||j |_ dS)NrTrsr-z#unexpected-EOF-after-solidus-in-tagr.z)unexpected-character-after-solidus-in-tag) rrGr$ryrr1rHrrNr!r"rrs r*rz&HTMLTokenizer.selfClosingStartTagStateBs{!! 3;;/3D m ,  ! ! # # # # S[[ O " "J|,D$I$K$K L L L K  d # # #DJJ O " "J|,D$O$Q$Q R R R K  d # # #6DJtr+c|jd}|dd}|jt d|d|j|j|_dS)Nrr{r?Commentr.T) rrreplacer1rHrrGr!r"rs r*rzHTMLTokenizer.bogusCommentStateTsz{%%c**||Hh//  *D 9 9 ; ; ; ^ tr+c|jg}|ddkr]||j|ddkr#tddd|_|j|_dSn|ddvrjd}dD]<}||j|d|vrd }n=|r&td ddddd |_|j|_dSn|dd kr|j|jj j r|jj j dj |jj j krSd}d D]>}||j|d|krd }n?|r|j |_dS|jtddd|r.|j||.|j|_dS)NrZrrr:r.T)dD))oOrSCtTyYpPeEFDoctype)r/rppublicIdsystemIdcorrect[)rrArrrr-zexpected-dashes-or-doctype)rrGrHrr$commentStartStater" doctypeStatertree openElements namespacedefaultNamespacecdataSectionStater1rNr3r)r'rRmatchedexpecteds r*rz(HTMLTokenizer.markupDeclarationOpenStatecsS[%%''( R=C     T[--// 0 0 0}##-7 -BB$O$O!!3 t$r]j ( (GA    !1!1!3!3444R=00#GE1 -7 -B-/15404%6%6!". t  ms""k%k+&k+B/9T[=M=^^^G:    !1!1!3!3444R=H,,#GE- !3 t  <(@ < > > ? ? ? / K  immoo . . . /+ tr+c|j}|dkr|j|_n|dkr>|jt ddd|jdxxdz cc<n|dkrT|jt dd d|j|j|j|_n~|turT|jt dd d|j|j|j|_n!|jdxx|z cc<|j |_d S) Nrr{r-r|r.r0r?rincorrect-commenteof-in-commentT) rrGcommentStartDashStater"r1rHrr$r!r commentStaters r*rzHTMLTokenizer.commentStartStatesn{!! 3;;3DJJ X   O " "J|,D,?$A$A B B B  f % % % 1 % % % % S[[ O " "J|,D$7$9$9 : : : O " "4#4 5 5 5DJJ S[[ O " "J|,D$4$6$6 7 7 7 O " "4#4 5 5 5DJJ  f % % % - % % %*DJtr+c|j}|dkr|j|_n|dkr>|jt ddd|jdxxdz cc<n|dkrT|jt dd d|j|j|j|_n|turT|jt dd d|j|j|j|_n$|jdxxd|zz cc<|j |_d S) Nrr{r-r|r.r0-�rrrT) rrGcommentEndStater"r1rHrr$r!rrrs r*rz#HTMLTokenizer.commentStartDashStatesr{!! 3;;-DJJ X   O " "J|,D,?$A$A B B B  f % % % 2 % % % % S[[ O " "J|,D$7$9$9 : : : O " "4#4 5 5 5DJJ S[[ O " "J|,D$4$6$6 7 7 7 O " "4#4 5 5 5DJJ  f % % %t 3 % % %*DJtr+c|j}|dkr |j|_n|dkr>|jt ddd|jdxxdz cc<n|turT|jt ddd|j|j|j |_n0|jdxx||j d zz cc<d S) Nrr{r-r|r.r0r?r)rr{T) rrGcommentEndDashStater"r1rHrr$rr!rrs r*rzHTMLTokenizer.commentStates#{!! 3;;1DJJ X   O " "J|,D,?$A$A B B B  f % % % 1 % % % % S[[ O " "J|,D,<$>$> ? ? ? O " "4#4 5 5 5DJJ  f % % % &&77*8 8 % % %tr+c|j}|dkr |j|_n|dkrJ|jt ddd|jdxxdz cc<|j|_n|turT|jt ddd|j|j|j |_n$|jdxxd|zz cc<|j|_d S) Nrr{r-r|r.r0r!zeof-in-comment-end-dashT) rrGr"r"r1rHrr$rrr!rs r*r$z!HTMLTokenizer.commentEndDashStates#{!! 3;;-DJJ X   O " "J|,D,?$A$A B B B  f % % % 2 % % %*DJJ S[[ O " "J|,D$=$?$? @ @ @ O " "4#4 5 5 5DJJ  f % % %t 3 % % %*DJtr+c|j}|dkr-|j|j|j|_ny|dkrK|jtddd|jdxxdz cc<|j|_n(|dkr5|jtdd d|j |_n|d kr>|jtdd d|jdxx|z cc<n|turT|jtdd d|j|j|j|_nL|jtdd d|jdxxd|zz cc<|j|_dS)Nrr{r-r|r.r0u--�rz,unexpected-bang-after-double-dash-in-commentrz,unexpected-dash-after-double-dash-in-commentzeof-in-comment-double-dashzunexpected-char-in-commentz--T) rrGr1rHr$r!r"rrcommentEndBangStaterrs r*r"zHTMLTokenizer.commentEndStates{!! 3;; O " "4#4 5 5 5DJJ X   O " "J|,D,?$A$A B B B  f % % % 3 % % %*DJJ S[[ O " "J|,D$R$T$T U U U1DJJ S[[ O " "J|,D$R$T$T U U U  f % % % - % % % % S[[ O " "J|,D$@$B$B C C C O " "4#4 5 5 5DJJ O " "J|,D$@$B$B C C C  f % % % 4 % % %*DJtr+c|j}|dkr,|j|j|j|_n|dkr"|jdxxdz cc<|j|_n|dkrJ|jtddd|jdxxd z cc<|j |_n|turT|jtdd d|j|j|j|_n$|jdxxd|zz cc<|j |_d S) Nrrr0z--!r{r-r|r.u--!�zeof-in-comment-end-bang-stateT) rrGr1rHr$r!r"r$rrrrs r*r'z!HTMLTokenizer.commentEndBangStatesq{!! 3;; O " "4#4 5 5 5DJJ S[[  f % % % . % % %1DJJ X   O " "J|,D,?$A$A B B B  f % % % 4 % % %*DJJ S[[ O " "J|,D$C$E$E F F F O " "4#4 5 5 5DJJ  f % % % 5 % % %*DJtr+c|j}|tvr |j|_n|t ur^|jtdddd|j d<|j|j |j |_nN|jtddd|j ||j|_dS)Nr-!expected-doctype-name-but-got-eofr.Frzneed-space-after-doctypeT) rrGr beforeDoctypeNameStater"rr1rHrr$r!rNrs r*rzHTMLTokenizer.doctypeStates{!! ? " "4DJJ S[[ O " "J|,D$G$I$I J J J+0D i ( O " "4#4 5 5 5DJJ O " "J|,D$>$@$@ A A A K  d # # #4DJtr+c|j}|tvrn&|dkr^|jt dddd|jd<|j|j|j|_n|dkr?|jt dddd |jd <|j |_n}|tur^|jt dd dd|jd<|j|j|j|_n||jd <|j |_d S) Nrr-z+expected-doctype-name-but-got-right-bracketr.Frr{r|r?rpr*T) rrGr r1rHrr$r!r"doctypeNameStaterrs r*r+z$HTMLTokenizer.beforeDoctypeNameState*sp{!! ? " "  S[[ O " "J|,D$Q$S$S T T T+0D i ( O " "4#4 5 5 5DJJ X   O " "J|,D,?$A$A B B B(0D f %.DJJ S[[ O " "J|,D$G$I$I J J J+0D i ( O " "4#4 5 5 5DJJ(,D f %.DJtr+cp|j}|tvr;|jdt |jd<|j|_nX|dkrY|jdt |jd<|j |j|j |_n|dkrJ|j tddd|jdxxdz cc<|j |_n|tur|j tdddd |jd <|jdt |jd<|j |j|j |_n|jdxx|z cc<d S) Nrprr{r-r|r.r?zeof-in-doctype-nameFrT)rrGr r$rtrafterDoctypeNameStater"r1rHr!rr-rrs r*r-zHTMLTokenizer.doctypeNameStateDs{!! ? " "(,(9&(A(K(KL\(](]D f %3DJJ S[[(,(9&(A(K(KL\(](]D f % O " "4#4 5 5 5DJJ X   O " "J|,D,?$A$A B B B  f % % % 1 % % %.DJJ S[[ O " "J|,D$9$;$; < < <+0D i ((,(9&(A(K(KL\(](]D f % O " "4#4 5 5 5DJJ  f % % % - % % %tr+c^|j}|tvrn|dkr-|j|j|j|_nU|turxd|jd<|j ||jtddd|j|j|j|_n|dvr9d}d D]#}|j}||vrd}n$|r|j |_dSn<|d vr8d}d D]#}|j}||vrd}n$|r|j |_dS|j ||jtdd d |idd|jd<|j |_dS)NrFrr-eof-in-doctyper.rT))uU)bB)lL)iIrsS)rr:rr )mMz*expected-space-or-right-bracket-in-doctyper0r<)rrGr r1rHr$r!r"rrNrafterDoctypePublicKeywordStateafterDoctypeSystemKeywordStatebogusDoctypeState)r'r0rrs r*r/z#HTMLTokenizer.afterDoctypeNameState]s{!! ? " "  S[[ O " "4#4 5 5 5DJJ S[[+0D i ( K  d # # # O " "J|,D$4$6$6 7 7 7 O " "4#4 5 5 5DJJz!!!9H;++--D8++"', !%!DDJ4 ##!9H;++--D8++"', !%!DDJ4 K  d # # # O " "J|,D$P%+TN$4$4 5 5 5,1D i (/DJtr+c$|j}|tvr |j|_n|dvrO|jtddd|j||j|_n|tur^|jtdddd|j d<|j|j |j |_n&|j||j|_dS N)rrr-unexpected-char-in-doctyper.r1FrT) rrGr "beforeDoctypePublicIdentifierStater"r1rHrrNrr$r!rs r*r?z,HTMLTokenizer.afterDoctypePublicKeywordState{!! ? " "@DJJ Z   O " "J|,D$@$B$B C C C K  d # # #@DJJ S[[ O " "J|,D$4$6$6 7 7 7+0D i ( O " "4#4 5 5 5DJJ K  d # # #@DJtr+c|j}|tvrnE|dkrd|jd<|j|_n'|dkrd|jd<|j|_n |dkr^|jtdddd |jd <|j|j|j |_n|tur^|jtdd dd |jd <|j|j|j |_n>|jtdd dd |jd <|j |_d S)Nrr:r rrr-unexpected-end-of-doctyper.Frr1rDT) rrGr r$(doctypePublicIdentifierDoubleQuotedStater"(doctypePublicIdentifierSingleQuotedStater1rHrr!rrArs r*rEz0HTMLTokenizer.beforeDoctypePublicIdentifierStates{!! ? " "  T\\,.D j )FDJJ S[[,.D j )FDJJ S[[ O " "J|,D$?$A$A B B B+0D i ( O " "4#4 5 5 5DJJ S[[ O " "J|,D$4$6$6 7 7 7+0D i ( O " "4#4 5 5 5DJJ O " "J|,D$@$B$B C C C+0D i (/DJtr+c|j}|dkr|j|_n$|dkr>|jt ddd|jdxxdz cc<n|dkr^|jt dd dd |jd <|j|j|j|_n||tur^|jt dd dd |jd <|j|j|j|_n|jdxx|z cc<d S)Nrr{r-r|r.r r?rrHFrr1T rrG!afterDoctypePublicIdentifierStater"r1rHrr$r!rrs r*rIz6HTMLTokenizer.doctypePublicIdentifierDoubleQuotedState{!! 4<<?DJJ X   O " "J|,D,?$A$A B B B  j ) ) )X 5 ) ) ) ) S[[ O " "J|,D$?$A$A B B B+0D i ( O " "4#4 5 5 5DJJ S[[ O " "J|,D$4$6$6 7 7 7+0D i ( O " "4#4 5 5 5DJJ  j ) ) )T 1 ) ) )tr+c|j}|dkr|j|_n$|dkr>|jt ddd|jdxxdz cc<n|dkr^|jt dd dd |jd <|j|j|j|_n||tur^|jt dd dd |jd <|j|j|j|_n|jdxx|z cc<d S)Nrr{r-r|r.r r?rrHFrr1TrLrs r*rJz6HTMLTokenizer.doctypePublicIdentifierSingleQuotedState{!! 3;;?DJJ X   O " "J|,D,?$A$A B B B  j ) ) )X 5 ) ) ) ) S[[ O " "J|,D$?$A$A B B B+0D i ( O " "4#4 5 5 5DJJ S[[ O " "J|,D$4$6$6 7 7 7+0D i ( O " "4#4 5 5 5DJJ  j ) ) )T 1 ) ) )tr+c*|j}|tvr|j|_nb|dkr-|j|j|j|_n/|dkr?|jtdddd|jd<|j |_n|dkr?|jtdddd|jd<|j |_n|tur^|jtdd dd |jd <|j|j|j|_n>|jtdddd |jd <|j |_d S) Nrrr-rDr.r:rrr1FrT)rrGr -betweenDoctypePublicAndSystemIdentifiersStater"r1rHr$r!r(doctypeSystemIdentifierDoubleQuotedState(doctypeSystemIdentifierSingleQuotedStaterrArs r*rMz/HTMLTokenizer.afterDoctypePublicIdentifierStates{!! ? " "KDJJ S[[ O " "4#4 5 5 5DJJ S[[ O " "J|,D$@$B$B C C C,.D j )FDJJ S[[ O " "J|,D$@$B$B C C C,.D j )FDJJ S[[ O " "J|,D$4$6$6 7 7 7+0D i ( O " "4#4 5 5 5DJJ O " "J|,D$@$B$B C C C+0D i (/DJtr+ct|j}|tvrn|dkr,|j|j|j|_n|dkrd|jd<|j|_n|dkrd|jd<|j |_n|tkr^|jtdddd |jd <|j|j|j|_n>|jtdd dd |jd <|j |_d S) Nrrr:rrr-r1r.FrrDT) rrGr r1rHr$r!r"rSrTrrrArs r*rRz;HTMLTokenizer.betweenDoctypePublicAndSystemIdentifiersStatesK{!! ? " "  S[[ O " "4#4 5 5 5DJJ S[[,.D j )FDJJ S[[,.D j )FDJJ S[[ O " "J|,D$4$6$6 7 7 7+0D i ( O " "4#4 5 5 5DJJ O " "J|,D$@$B$B C C C+0D i (/DJtr+c$|j}|tvr |j|_n|dvrO|jtddd|j||j|_n|tur^|jtdddd|j d<|j|j |j |_n&|j||j|_dSrC) rrGr "beforeDoctypeSystemIdentifierStater"r1rHrrNrr$r!rs r*r@z,HTMLTokenizer.afterDoctypeSystemKeywordState)rFr+c|j}|tvrnE|dkrd|jd<|j|_n'|dkrd|jd<|j|_n |dkr^|jtdddd |jd <|j|j|j |_n|tur^|jtdd dd |jd <|j|j|j |_n>|jtdddd |jd <|j |_d S) Nrr:rrrr-rDr.Frr1T) rrGr r$rSr"rTr1rHrr!rrArs r*rWz0HTMLTokenizer.beforeDoctypeSystemIdentifierState=s{!! ? " "  T\\,.D j )FDJJ S[[,.D j )FDJJ S[[ O " "J|,D$@$B$B C C C+0D i ( O " "4#4 5 5 5DJJ S[[ O " "J|,D$4$6$6 7 7 7+0D i ( O " "4#4 5 5 5DJJ O " "J|,D$@$B$B C C C+0D i (/DJtr+c|j}|dkr|j|_n$|dkr>|jt ddd|jdxxdz cc<n|dkr^|jt dd dd |jd <|j|j|j|_n||tur^|jt dd dd |jd <|j|j|j|_n|jdxx|z cc<d S)Nrr{r-r|r.rr?rrHFrr1T rrG!afterDoctypeSystemIdentifierStater"r1rHrr$r!rrs r*rSz6HTMLTokenizer.doctypeSystemIdentifierDoubleQuotedStateZrNr+c|j}|dkr|j|_n$|dkr>|jt ddd|jdxxdz cc<n|dkr^|jt dd dd |jd <|j|j|j|_n||tur^|jt dd dd |jd <|j|j|j|_n|jdxx|z cc<d S)Nrr{r-r|r.rr?rrHFrr1TrZrs r*rTz6HTMLTokenizer.doctypeSystemIdentifierSingleQuotedStaterrPr+c|j}|tvrn|dkr,|j|j|j|_n|tur^|jtdddd|jd<|j|j|j|_n4|jtddd|j |_dS) Nrr-r1r.FrrDT) rrGr r1rHr$r!r"rrrArs r*r[z/HTMLTokenizer.afterDoctypeSystemIdentifierStates{!! ? " "  S[[ O " "4#4 5 5 5DJJ S[[ O " "J|,D$4$6$6 7 7 7+0D i ( O " "4#4 5 5 5DJJ O " "J|,D$@$B$B C C C/DJtr+c<|j}|dkr,|j|j|j|_nP|turF|j||j|j|j|_n dS)NrT) rrGr1rHr$r!r"rrNrs r*rAzHTMLTokenizer.bogusDoctypeStates{!! 3;; O " "4#4 5 5 5DJJ S[[ K  d # # # O " "4#4 5 5 5DJJ tr+cg} ||jd||jd|j}|tkrnF|dksJ|ddddkr|ddd|d<n||d|}|d}|d krPt|D]*}|jtd d d +| dd }|r(|jtd|d |j |_ dS)NT]rrZz]]r:r{rr-r|r.r?r_) rHrrrGrrJcountranger1rrr!r")r'r0rG nullCountrs r*rzHTMLTokenizer.cdataSectionStates & KK ..s33 4 4 4 KK ..s33 4 4 4;##%%Ds{{s{{{{8BCC=D((#Bx}DHKK%%% &wwt}}JJx(( q==9%% F F&& <0H0C(E(EFFFF<<(33D  3 O " "J|,D,0$2$2 3 3 3^ tr+)Nr)N__name__ __module__ __qualname____doc__r&r6rUrlrnryr!r~rrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr$r"r'rr+r-r/r?rErIrJrMrRr@rWrSrTr[rAr __classcell__)r)s@r*rrs   . . . . . .000 FFFPNTNTNTNT`HHH $$$8: : $$   !!!F0,      8      8      8((,      8 *.2    <444l@   D&&2($   +++Z..$&>."42111f(:00<4(:00&   r+rN) __future__rrrpip._vendor.sixrrL collectionsrrsysr constantsr r r rrrrrrr _inputstreamr_trierr`dictruobjectrr+r*rtsiBBBBBBBBBB))))))********&&&&&&55555555----------00000000,,,,,,))))))tH~~ 6LLLlllllFlllllr+