Syntax::Highlight::Engine::Kate(3) a port to Perl of the syntax highlight engine of the Kate text editor.

SYNOPSIS


#if you want to create a compiled executable, you may want to do this:
use Syntax::Highlight::Engine::Kate::All;
use Syntax::Highlight::Engine::Kate;
my $hl = Syntax::Highlight::Engine::Kate->new(
language => 'Perl',
substitutions => {
"<" => "&lt;",
">" => "&gt;",
"&" => "&amp;",
" " => "&nbsp;",
"\t" => "&nbsp;&nbsp;&nbsp;",
"\n" => "<BR>\n",
},
format_table => {
Alert => [ "<font color=\"#0000ff\">", "</font>" ],
BaseN => [ "<font color=\"#007f00\">", "</font>" ],
BString => [ "<font color=\"#c9a7ff\">", "</font>" ],
Char => [ "<font color=\"#ff00ff\">", "</font>" ],
Comment => [ "<font color=\"#7f7f7f\"><i>", "</i></font>" ],
DataType => [ "<font color=\"#0000ff\">", "</font>" ],
DecVal => [ "<font color=\"#00007f\">", "</font>" ],
Error => [ "<font color=\"#ff0000\"><b><i>", "</i></b></font>" ],
Float => [ "<font color=\"#00007f\">", "</font>" ],
Function => [ "<font color=\"#007f00\">", "</font>" ],
IString => [ "<font color=\"#ff0000\">", "" ],
Keyword => [ "<b>", "</b>" ],
Normal => [ "", "" ],
Operator => [ "<font color=\"#ffa500\">", "</font>" ],
Others => [ "<font color=\"#b03060\">", "</font>" ],
RegionMarker => [ "<font color=\"#96b9ff\"><i>", "</i></font>" ],
Reserved => [ "<font color=\"#9b30ff\"><b>", "</b></font>" ],
String => [ "<font color=\"#ff0000\">", "</font>" ],
Variable => [ "<font color=\"#0000ff\"><b>", "</b></font>" ],
Warning => [ "<font color=\"#0000ff\"><b><i>", "</b></i></font>" ],
},
);
#or
my $hl = Syntax::Highlight::Engine::Kate::Perl->new(
substitutions => {
# ...
},
format_table => {
# ...
},
);
print "<html>\n<head>\n</head>\n<body>\n";
while ( my $in = <> ) {
print $hl->highlightText($in);
}
print "</body>\n</html>\n";

DESCRIPTION

Syntax::Highlight::Engine::Kate is a port to Perl of the syntax highlight engine of the Kate text editor.

The language XML files of Kate have been rewritten to Perl modules using a script. These modules function as plugins to this module.

Syntax::Highlight::Engine::Kate inherits Syntax::Highlight::Engine::Kate::Template.

OPTIONS

language
Specify the language you want highlighted. Look in the PLUGINS section for supported languages.
plugins
If you created your own language plugins you may specify a list of them with this option.

 plugins => [
   ["MyModuleName", "MyLanguageName", "*,ext1;*.ext2", "Section"],
   ....
 ]
format_table
This option must be specified if the highlightText method needs to do anything useful for you. All mentioned keys in the synopsis must be specified.
substitutions
With this option you can specify additional formatting options.

METHODS

extensions
Returns a reference to the extensions hash.
language(?$language?)
Sets and returns the current language that is highlighted. When setting the language a reset is also done.
languageAutoSet($filename)
Suggests language name for the given file $filename.
languageList
Returns a list of languages for which plugins have been defined.
languagePlug($language, ?$insensitive?)
Returns the module name of the plugin for $language.

If $insensitive is set it will also try to match names ignoring case and return the correct module name of the plugin.

e.g. "$highlighter->languagePlug('HtMl', 1);" will return "HTML".

languagePropose($filename)
Suggests language name for the given file $filename.
sections
Returns a reference to the sections hash.

ATTRIBUTES

In the Kate XML syntax files you find under the section <itemDatas> entries like "<itemData name="Unknown Property" defStyleNum="dsError" italic="1"/>". Kate is an editor so it is ok to have definitions for foreground and background colors and so on. However, since this module is supposed to be a more universal highlight engine, the attributes need to be fully abstract. In which case, Kate does not have enough default attributes defined to fulfill all needs. Kate defines the following standard attributes:
  • dsNormal
  • dsKeyword
  • dsDataType
  • dsDecVal
  • dsBaseN
  • dsFloat
  • dsChar
  • dsString
  • dsComment
  • dsOthers
  • dsAlert
  • dsFunction
  • dsRegionMarker
  • dsError

This module leaves out the <ds> part and uses following additional attributes:

  • BString
  • IString
  • Operator
  • Reserved
  • Variable

I have modified the XML files so that each highlight mode would get its own attribute. In quite a few cases still not enough attributes were defined. So in some languages different modes have the same attribute.

PLUGINS

Below is an overview of existing plugins. All have been tested on use and can be created. The ones for which no sample file is available are marked. Those marked OK have highlighted the test file without apparent mistakes. This does not mean that all bugs are shaken out.

 LANGUAGE             MODULE                   COMMENT
 ********             ******                   ******
 .desktop             Desktop                  OK
 4GL                  FourGL                   No sample file
 4GL-PER              FourGLminusPER           No sample file
 ABC                  ABC                      OK
 AHDL                 AHDL                     OK
 ANSI C89             ANSI_C89                 No sample file
 ASP                  ASP                      OK
 AVR Assembler        AVR_Assembler            OK
 AWK                  AWK                      OK
 Ada                  Ada                      No sample file
                      Alerts                   OK hidden module
 Ansys                Ansys                    No sample file
 Apache Configuration Apache_Configuration     No sample file
 Asm6502              Asm6502                  No sample file
 Bash                 Bash                     OK
 BibTeX               BibTeX                   OK
 C                    C                        No sample file
 C#                   Cdash                    No sample file
 C++                  Cplusplus                OK
 CGiS                 CGiS                     No sample file
 CMake                CMake                    OK
 CSS                  CSS                      OK
 CUE Sheet            CUE_Sheet                No sample file
 Cg                   Cg                       No sample file
 ChangeLog            ChangeLog                No sample file
 Cisco                Cisco                    No sample file
 Clipper              Clipper                  OK
 ColdFusion           ColdFusion               No sample file
 Common Lisp          Common_Lisp              OK
 Component-Pascal     ComponentminusPascal     No sample file
 D                    D                        No sample file
 Debian Changelog     Debian_Changelog         No sample file
 Debian Control       Debian_Control           No sample file
 Diff                 Diff                     No sample file
 Doxygen              Doxygen                  OK
 E Language           E_Language               OK
 Eiffel               Eiffel                   No sample file
 Email                Email                    OK
 Euphoria             Euphoria                 OK
 Fortran              Fortran                  OK
 FreeBASIC            FreeBASIC                No sample file
 GDL                  GDL                      No sample file
 GLSL                 GLSL                     OK
 GNU Assembler        GNU_Assembler            No sample file
 GNU Gettext          GNU_Gettext              No sample file
 HTML                 HTML                     OK
 Haskell              Haskell                  OK
 IDL                  IDL                      No sample file
 ILERPG               ILERPG                   No sample file
 INI Files            INI_Files                No sample file
 Inform               Inform                   No sample file
 Intel x86 (NASM)     Intel_X86_NASM           seems to have issues
 JSP                  JSP                      OK
 Java                 Java                     OK
 JavaScript           JavaScript               OK
 Javadoc              Javadoc                  No sample file
 KBasic               KBasic                   No sample file
 Kate File Template   Kate_File_Template       No sample file
 LDIF                 LDIF                     No sample file
 LPC                  LPC                      No sample file
 LaTeX                LaTex                    OK
 Lex/Flex             Lex_Flex                 OK
 LilyPond             LilyPond                 OK
 Literate Haskell     Literate_Haskell         OK
 Lua                  Lua                      No sample file
 M3U                  M3U                      OK
 MAB-DB               MABminusDB               No sample file
 MIPS Assembler       MIPS_Assembler           No sample file
 Makefile             Makefile                 No sample file
 Mason                Mason                    No sample file
 Matlab               Matlab                   has issues
 Modula-2             Modulaminus2             No sample file
 Music Publisher      Music_Publisher          No sample file
 Octave               Octave                   OK
 PHP (HTML)           PHP_HTML                 OK
                      PHP_PHP                  OK hidden module
 POV-Ray              POV_Ray                  OK
 Pascal               Pascal                   No sample file
 Perl                 Perl                     OK
 PicAsm               PicAsm                   OK
 Pike                 Pike                     OK
 PostScript           PostScript               OK
 Prolog               Prolog                   No sample file
 PureBasic            PureBasic                OK
 Python               Python                   OK
 Quake Script         Quake_Script             No sample file
 R Script             R_Script                 No sample file
 REXX                 REXX                     No sample file
 RPM Spec             RPM_Spec                 No sample file
 RSI IDL              RSI_IDL                  No sample file
 RenderMan RIB        RenderMan_RIB            OK
 Ruby                 Ruby                     OK
 SGML                 SGML                     No sample file
 SML                  SML                      No sample file
 SQL                  SQL                      No sample file
 SQL (MySQL)          SQL_MySQL                No sample file
 SQL (PostgreSQL)     SQL_PostgreSQL           No sample file
 Sather               Sather                   No sample file
 Scheme               Scheme                   OK
 Sieve                Sieve                    No sample file
 Spice                Spice                    OK
 Stata                Stata                    OK
 TI Basic             TI_Basic                 No sample file
 TaskJuggler          TaskJuggler              No sample file
 Tcl/Tk               TCL_Tk                   OK
 UnrealScript         UnrealScript             OK
 VHDL                 VHDL                     No sample file
 VRML                 VRML                     OK
 Velocity             Velocity                 No sample file
 Verilog              Verilog                  No sample file
 WINE Config          WINE_Config              No sample file
 Wikimedia            Wikimedia                No sample file
 XML                  XML                      OK
 XML (Debug)          XML_Debug                No sample file
 Yacc/Bison           Yacc_Bison               OK
 de_DE                De_DE                    No sample file
 en_EN                En_EN                    No sample file
 ferite               Ferite                   No sample file
 nl                   Nl                       No sample file
 progress             Progress                 No sample file
 scilab               Scilab                   No sample file
 txt2tags             Txt2tags                 No sample file
 x.org Configuration  X_org_Configuration      OK
 xHarbour             XHarbour                 OK
 xslt                 Xslt                     No sample file
 yacas                Yacas                    No sample file

BUGS

Float is detected differently than in the Kate editor.

The regular expression engine of the Kate editor, qregexp, appears to be more tolerant to mistakes in regular expressions than Perl. This might lead to error messages and differences in behaviour. Most of the problems were sorted out while developing, because error messages appeared. For as far as differences in behaviour is concerned, testing is the only way to find out, so I hope the users out there will be able to tell me more.

This module is mimicking the behaviour of the syntax highlight engine of the Kate editor. If you find a bug/mistake in the highlighting, please check if Kate behaves in the same way. If yes, the cause is likely to be found there.

TO DO

Rebuild the scripts I am using to generate the modules from XML files so they are more pro-actively tracking flaws in the build of the XML files like missing lists. Also regular expressions in the XML can be tested better before they are used in plugins.

Refine the test methods in Syntax::Highlight::Engine::Kate::Template, so that choices for case sensitivity, dynamic behaviour and lookahead can be determined at generate time of the plugin, might increase throughput.

Implement code folding.

ACKNOWLEDGEMENTS

All the people who wrote Kate and the syntax highlight XML files.

AUTHOR AND COPYRIGHT

This module is written and maintained by:

Hans Jeuken < haje at toneel dot demon dot nl >

Copyright (c) 2006 by Hans Jeuken, all rights reserved.

You may freely distribute and/or modify this module under the same terms as Perl itself.