1PHP Markdown Extra
2==================
3
4Version 1.2.3 - Wed 31 Dec 2008
5
6by Michel Fortin
7<http://www.michelf.com/>
8
9based on Markdown by John Gruber  
10<http://daringfireball.net/>
11
12
13Introduction
14------------
15
16This is a special version of PHP Markdown with extra features. See
17<http://www.michelf.com/projects/php-markdown/extra/> for details.
18
19Markdown is a text-to-HTML conversion tool for web writers. Markdown
20allows you to write using an easy-to-read, easy-to-write plain text
21format, then convert it to structurally valid XHTML (or HTML).
22
23"Markdown" is two things: a plain text markup syntax, and a software 
24tool, written in Perl, that converts the plain text markup to HTML. 
25PHP Markdown is a port to PHP of the original Markdown program by 
26John Gruber.
27
PHP Markdown can work as a plug-in for WordPress and bBlog, as a 
modifier for the Smarty templating engine, or as a replacement for
Textile formatting in any software that supports Textile.
31
32Full documentation of Markdown's syntax is available on John's 
33Markdown page: <http://daringfireball.net/projects/markdown/>
34
35
Installation and Requirements
-----------------------------
38
39PHP Markdown requires PHP version 4.0.5 or later.
40
41
42### WordPress ###
43
44PHP Markdown works with [WordPress][wp], version 1.2 or later.
45
46 [wp]: http://wordpress.org/
47
1.  To use PHP Markdown with WordPress, place the "markdown.php" file 
    in the "plugins" folder. This folder is located inside 
    "wp-content" at the root of your site:
51
52        (site home)/wp-content/plugins/
53
2.  Activate the plugin from the administrative interface of 
    WordPress. In the "Plugins" section you will now find Markdown. 
    To activate the plugin, click on the "Activate" button on the 
    same line as Markdown. Your entries will now be formatted by 
    PHP Markdown.

3.  To post Markdown content, you'll first have to disable the 
    "visual" editor in the User section of WordPress.
62
63You can configure PHP Markdown to not apply to the comments on your 
64WordPress weblog. See the "Configuration" section below.
65
It is not possible at this time to apply a different set of 
filters to different entries. All your entries will be formatted by 
PHP Markdown. This is a limitation of WordPress. If your old entries 
are written in HTML (as opposed to another formatting syntax, like 
Textile), they'll probably stay fine after installing Markdown.
71
72
73### bBlog ###
74
75PHP Markdown also works with [bBlog][bb].
76
77 [bb]: http://www.bblog.com/
78
79To use PHP Markdown with bBlog, rename "markdown.php" to 
80"modifier.markdown.php" and place the file in the "bBlog_plugins" 
81folder. This folder is located inside the "bblog" directory of 
82your site, like this:
83
84        (site home)/bblog/bBlog_plugins/modifier.markdown.php
85
86Select "Markdown" as the "Entry Modifier" when you post a new 
87entry. This setting will only apply to the entry you are editing.
88
89
90### Replacing Textile in TextPattern ###
91
[TextPattern][tp] uses [Textile][tx] to format your text. You can 
replace Textile with Markdown in TextPattern without having to change
any code by using the *Textile Compatibility Mode*. This may work 
with other software that expects Textile too.
96
97 [tx]: http://www.textism.com/tools/textile/
98 [tp]: http://www.textpattern.com/
99
1.  Rename the "markdown.php" file to "classTextile.php". This will
    make PHP Markdown behave as if it were the actual Textile parser.

2.  Replace the "classTextile.php" file that TextPattern installed in 
    your web directory with the renamed file. It can be found in the 
    "lib" directory:
105
106		(site home)/textpattern/lib/
107
Unlike Textile, Markdown does not convert quotes to curly ones 
and does not convert multiple hyphens (`--` and `---`) into en- and 
em-dashes. If you use PHP Markdown in Textile Compatibility Mode, you 
can solve this problem by installing the "smartypants.php" file from 
[PHP SmartyPants][psp] beside the "classTextile.php" file. The Textile 
Compatibility Mode function will use SmartyPants automatically without 
further modification.
115
116 [psp]: http://www.michelf.com/projects/php-smartypants/
117
118
119### In Your Own Programs ###
120
121You can use PHP Markdown easily in your current PHP program. Simply 
122include the file and then call the Markdown function on the text you 
123want to convert:
124
    include_once "markdown.php";
    $my_html = Markdown($my_text);
127
128If you wish to use PHP Markdown with another text filter function 
129built to parse HTML, you should filter the text *after* the Markdown
130function call. This is an example with [PHP SmartyPants][psp]:
131
    $my_html = SmartyPants(Markdown($my_text));
133
134
135### With Smarty ###
136
If your program uses the [Smarty][sm] template engine, PHP Markdown 
can now be used as a modifier for your templates. Rename "markdown.php" 
to "modifier.markdown.php" and put it in your Smarty plugins folder.
140
141  [sm]: http://smarty.php.net/
142
143If you are using MovableType 3.1 or later, the Smarty plugin folder is 
144located at `(MT CGI root)/php/extlib/smarty/plugins`. This will allow 
145Markdown to work on dynamic pages.
146
147
148### Updating Markdown in Other Programs ###
149
150Many web applications now ship with PHP Markdown, or have plugins to 
151perform the conversion to HTML. You can update PHP Markdown -- or 
152replace it with PHP Markdown Extra -- in many of these programs by 
153swapping the old "markdown.php" file for the new one.
154
155Here is a short non-exhaustive list of some programs and where they 
156hide the "markdown.php" file.
157
158| Program   | Path to Markdown
159| -------   | ----------------
160| [Pivot][] | `(site home)/pivot/includes/markdown/`
161
If you're unsure whether you can do this with your application, ask the 
developer, or wait for the developer to update the application or 
plugin with the new version of PHP Markdown.
165
166 [Pivot]: http://pivotlog.net/
167
168
169Configuration
170-------------
171
By default, PHP Markdown produces XHTML-style output for empty 
elements. E.g.:
174
175    <br />
176
177Markdown can be configured to produce HTML-style tags; e.g.:
178
179    <br>
180
To do this, you must edit the `MARKDOWN_EMPTY_ELEMENT_SUFFIX` 
definition below the "Global default settings" header at the start of 
the "markdown.php" file.
184
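As an alternative to editing the file, the version history notes that settings can be overridden by defining constants before "markdown.php" is included. The following is a minimal sketch of that approach. It assumes that `">"` is the value that yields HTML-style tags; check the comment next to the definition in your copy of the file.

```php
<?php
// Define the suffix before loading PHP Markdown so that this value
// takes precedence over the default set inside "markdown.php".
define('MARKDOWN_EMPTY_ELEMENT_SUFFIX', '>');

include_once "markdown.php";

// Two trailing spaces before the newline ask Markdown for a line break;
// with the setting above it should be written as an HTML-style tag.
$my_html = Markdown("First line  \nSecond line");
```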
185
186### WordPress-Specific Settings ###
187
188By default, the Markdown plugin applies to both posts and comments on 
189your WordPress weblog. To deactivate one or the other, edit the 
190`MARKDOWN_WP_POSTS` or `MARKDOWN_WP_COMMENTS` definitions under the 
191"WordPress settings" header at the start of the "markdown.php" file.
192
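For example, to keep Markdown formatting for posts but turn it off for comments, the edited definitions might look like the sketch below. The exact layout of these lines in your copy of "markdown.php" may differ, so treat this as an illustration rather than a verbatim excerpt.

```php
<?php
// Under the "WordPress settings" header in "markdown.php":
define('MARKDOWN_WP_POSTS',    true);   // keep formatting posts with Markdown
define('MARKDOWN_WP_COMMENTS', false);  // do not apply Markdown to comments
```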
193
194Bugs
195----
196
197To file bug reports please send email to:
198<michel.fortin@michelf.com>
199
200Please include with your report: (1) the example input; (2) the output you
201expected; (3) the output PHP Markdown actually produced.
202
203
204Version History
205---------------
206
207Extra 1.2.3 (31 Dec 2008):
208
*	In WordPress pages featuring more than one post, footnote id prefixes are 
	now automatically set from the current post ID to avoid clashes
	between footnotes belonging to different posts.

*	Fix for a bug introduced in Extra 1.2 where block-level HTML tags were 
	not detected correctly, which caused erroneous `<p>` tags to be added 
	and the content of those tags to be interpreted as Markdown-formatted 
	instead of HTML-formatted.
217
218
219Extra 1.2.2 (21 Jun 2008):
220
221*	Fixed a problem where abbreviation definitions, footnote
222	definitions and link references were stripped inside
223	fenced code blocks.
224
225*	Fixed a bug where characters such as `"` in abbreviation
226	definitions weren't properly encoded to HTML entities.
227
228*	Fixed a bug where double quotes `"` were not correctly encoded
229	as HTML entities when used inside a footnote reference id.
230
231
2321.0.1m (21 Jun 2008):
233
234*	Lists can now have empty items.
235
*	Rewrote the emphasis and strong emphasis parser to fix some issues
	with oddly placed and overlong markers.
238
239
240Extra 1.2.1 (27 May 2008):
241
242*	Fixed a problem where Markdown headers and horizontal rules were
243	transformed into their HTML equivalent inside fenced code blocks.
244
245
246Extra 1.2 (11 May 2008):
247
*	Added a fenced code block syntax which doesn't require indentation
	and can start and end with blank lines. A fenced code block
	starts with a line of consecutive tildes (~) and ends on the
	next line with the same number of consecutive tildes. Here's an
	example:
	
		~~~~~~~~~~~~
		Hello World!
		~~~~~~~~~~~~

*	Rewrote parts of the HTML block parser to better accommodate
	fenced code blocks.
260
261*	Footnotes may now be referenced from within another footnote.
262
*	Added programmatically-settable parser property `predef_attr` for 
	predefined attribute definitions.
265
266*	Fixed an issue where an indented code block preceded by a blank
267	line containing some other whitespace would confuse the HTML 
268	block parser into creating an HTML block when it should have 
269	been code.
270
271
2721.0.1l (11 May 2008):
273
274*	Now removing the UTF-8 BOM at the start of a document, if present.
275
276*	Now accepting capitalized URI schemes (such as HTTP:) in automatic
277	links, such as `<HTTP://EXAMPLE.COM/>`.
278
279*	Fixed a problem where `<hr@example.com>` was seen as a horizontal
280	rule instead of an automatic link.
281
282*	Fixed an issue where some characters in Markdown-generated HTML
283	attributes weren't properly escaped with entities.
284
285*	Fix for code blocks as first element of a list item. Previously,
286	this didn't create any code block for item 2:
287	
288		*   Item 1 (regular paragraph)
289		
290		*       Item 2 (code block)
291
292*	A code block starting on the second line of a document wasn't seen
293	as a code block. This has been fixed.
294	
*	Added programmatically-settable parser properties `predef_urls` and 
	`predef_titles` for predefined URLs and titles for reference-style 
	links. To use this, your PHP code must call the parser this way:
	
		$parser = new Markdown_Parser;
		$parser->predef_urls = array('linkref' => 'http://example.com');
		$html = $parser->transform($text);
	
	You can then use the URL as a normal link reference:
	
		[my link][linkref]
		[my link][linkRef]
	
	Reference names in the parser properties *must* be lowercase.
	Reference names in the Markdown source may have any case.
310
311*	Added `setup` and `teardown` methods which can be used by subclassers
312	as hook points to arrange the state of some parser variables before and 
313	after parsing.
314
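The hook points can be pictured with the following self-contained sketch. `Sketch_Parser` is a hypothetical stand-in written for this example, not the real parser class; it only assumes that `transform` calls `setup` before parsing and `teardown` afterwards. With the real library, a subclass would extend the parser class and override the same two method names.

```php
<?php
// Hypothetical stand-in for the parser: only the hook sequence matters here.
class Sketch_Parser {
    var $log = array();

    function setup()    { $this->log[] = 'setup'; }
    function teardown() { $this->log[] = 'teardown'; }

    function transform($text) {
        $this->setup();                  // hook: prepare state before parsing
        $html = '<p>' . $text . '</p>';  // placeholder for the real parsing work
        $this->teardown();               // hook: clean up state after parsing
        return $html;
    }
}

// A subclasser overrides a hook and chains to the parent version.
class Counting_Parser extends Sketch_Parser {
    var $documents = 0;

    function setup() {
        parent::setup();
        $this->documents++;              // state arranged before each parse
    }
}

$parser = new Counting_Parser;
$parser->transform('one');
$parser->transform('two');
echo $parser->documents, "\n";           // 2
echo implode(',', $parser->log), "\n";   // setup,teardown,setup,teardown
```

Calling the parent method inside the override keeps whatever state handling the base class already performs.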
315
316Extra 1.1.7 (26 Sep 2007):
317
3181.0.1k (26 Sep 2007):
319
320*	Fixed a problem introduced in 1.0.1i where three or more identical
321	uppercase letters, as well as a few other symbols, would trigger
322	a horizontal line.
323
324
325Extra 1.1.6 (4 Sep 2007):
326
3271.0.1j (4 Sep 2007):
328
329*	Fixed a problem introduced in 1.0.1i where the closing `code` and 
330	`pre` tags at the end of a code block were appearing in the wrong 
331	order.
332
*	Overriding configuration settings by defining constants from an 
	external file before markdown.php is included is now possible without 
	producing a PHP warning.
336
337
338Extra 1.1.5 (31 Aug 2007):
339
3401.0.1i (31 Aug 2007):
341
342*	Fixed a problem where an escaped backslash before a code span 
343	would prevent the code span from being created. This should now
344	work as expected:
345	
		Literal backslash: \\`code span`
347
348*	Overall speed improvements, especially with long documents.
349
350
351Extra 1.1.4 (3 Aug 2007):
352
3531.0.1h (3 Aug 2007):
354
355*	Added two properties (`no_markup` and `no_entities`) to the parser 
356	allowing HTML tags and entities to be disabled.
357
358*	Fix for a problem introduced in 1.0.1g where posting comments in 
359	WordPress would trigger PHP warnings and cause some markup to be 
360	incorrectly filtered by the kses filter in WordPress.
361
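A minimal sketch of how these two properties might be set, following the same calling pattern as the `predef_urls` example in the 1.0.1l notes. The variable `$comment_text` is a placeholder, and using `true` to switch each feature off is an assumption based on the property names.

```php
<?php
include_once "markdown.php";

$comment_text = "Some <b>untrusted</b> input &amp; more.";  // placeholder input

$parser = new Markdown_Parser;
$parser->no_markup   = true;  // assumed: HTML tags are not treated as markup
$parser->no_entities = true;  // assumed: HTML entities are not passed through
$html = $parser->transform($comment_text);
```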
362
363Extra 1.1.3 (3 Jul 2007):
364
*	Fixed a performance problem when parsing some invalid HTML as an HTML 
	block which was resulting in too much recursion and a segmentation fault 
	for long documents.
368
369*	The markdown="" attribute now accepts unquoted values.
370
371*	Fixed an issue where underscore-emphasis didn't work when applied on the 
372	first or the last word of an element having the markdown="1" or 
373	markdown="span" attribute set unless there was some surrounding whitespace.
374	This didn't work:
375	
376		<p markdown="1">_Hello_ _world_</p>
377	
378	Now it does produce emphasis as expected.
379
380*	Fixed an issue preventing footnotes from working when the parser's 
381	footnote id prefix variable (fn_id_prefix) is not empty.
382
*	Fixed a performance problem where the regular expression for strong 
	emphasis introduced in version 1.1 could sometimes take a long time to 
	process, give slightly wrong results, and in some circumstances remove 
	the entire content of a paragraph.

*	Fixed an issue where abbreviation tags could be incorrectly added 
	inside URLs and titles of links.
390
391*	Placing footnote markers inside a link, resulting in two nested links, is 
392	no longer allowed.
393
394
3951.0.1g (3 Jul 2007):
396
397*	Fix for PHP 5 compiled without the mbstring module. Previous fix to 
398	calculate the length of UTF-8 strings in `detab` when `mb_strlen` is 
399	not available was only working with PHP 4.
400
401*	Fixed a problem with WordPress 2.x where full-content posts in RSS feeds 
402	were not processed correctly by Markdown.
403
404*	Now supports URLs containing literal parentheses for inline links 
405	and images, such as:
406
407		[WIMP](http://en.wikipedia.org/wiki/WIMP_(computing))
408
	Such parentheses may be arbitrarily nested, but must be
	balanced. Unbalanced parentheses are allowed, however, when they 
	are escaped or when the URL is enclosed in angle brackets `<>`.

*	Fixed a performance problem where the regular expression for strong 
	emphasis introduced in version 1.0.1d could sometimes take a long time 
	to process, give slightly wrong results, and in some circumstances 
	remove the entire content of a paragraph.

*	A change in version 1.0.1d made it possible for anchors to be 
	incorrectly nested within each other. This is now fixed.
420
421*	Fixed a rare issue where certain MD5 hashes in the content could
422	be changed to their corresponding text. For instance, this:
423
424		The MD5 value for "+" is "26b17225b626fb9238849fd60eabdf60".
425	
426	was incorrectly changed to this in previous versions of PHP Markdown:
427
428		<p>The MD5 value for "+" is "+".</p>
429
*	Escaped characters are now converted to their numeric character 
	reference equivalents.
	
	This fixes an integration issue with SmartyPants and backslash escapes. 
	Since Markdown and SmartyPants have some escapable characters in common, 
	it was sometimes necessary to escape them twice. Previously, two 
	backslashes were sometimes required to prevent Markdown from "eating" the 
	backslash before SmartyPants sees it:
438	
439		Here are two hyphens: \\--
440	
441	Now, only one backslash will do:
442	
443		Here are two hyphens: \--
444
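The conversion described in the last item can be sketched as follows. This is a simplified illustration, not the code inside "markdown.php": the function names and the list of escapable characters are assumptions made for the example.

```php
<?php
// Hypothetical callback: turn the escaped character into a numeric
// character reference, e.g. "-" becomes "&#45;".
function sketch_escape_callback($matches) {
    return '&#' . ord($matches[1]) . ';';
}

// Hypothetical helper: replace each backslash escape with the numeric
// reference of the character that follows the backslash.
function sketch_convert_escapes($text) {
    // A backslash followed by one escapable character (illustrative list).
    $pattern = '/\\\\([\\\\`*_{}\\[\\]()>#+\\-.!])/';
    return preg_replace_callback($pattern, 'sketch_escape_callback', $text);
}

echo sketch_convert_escapes('Here are two hyphens: \\--'), "\n";
// Here are two hyphens: &#45;-
```

Because the result no longer contains a backslash or a literal pair of hyphens, a later filter such as SmartyPants finds nothing to interpret at that spot.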
445
446Extra 1.1.2 (7 Feb 2007)
447
*	Fixed an issue where headers preceded too closely by a paragraph 
	(with no blank line separating them) were put inside the paragraph.
450
451*	Added the missing TextileRestricted method that was added to regular
452	PHP Markdown since 1.0.1d but which I forgot to add to Extra.
453
454
4551.0.1f (7 Feb 2007):
456
457*	Fixed an issue with WordPress where manually-entered excerpts, but 
458	not the auto-generated ones, would contain nested paragraphs.
459
*	Fixed an issue introduced in 1.0.1d where headers and blockquotes 
	preceded too closely by a paragraph (not separated by a blank line) 
	were incorrectly put inside the paragraph.
463
464*	Fixed an issue introduced in 1.0.1d in the tokenizeHTML method where 
465	two consecutive code spans would be merged into one when together they 
466	form a valid tag in a multiline paragraph.
467
*	Fixed a long-standing issue where blank lines in code blocks would 
	be doubled when the code block is in a list item.
470	
471	This was due to the list processing functions relying on artificially 
472	doubled blank lines to correctly determine when list items should 
473	contain block-level content. The list item processing model was thus 
474	changed to avoid the need for double blank lines.
475
476*	Fixed an issue with `<% asp-style %>` instructions used as inline 
477	content where the opening `<` was encoded as `&lt;`.
478
*	Fixed a parse error occurring when PHP is configured to accept 
	ASP-style delimiters as boundaries for PHP scripts.
481
482*	Fixed a bug introduced in 1.0.1d where underscores in automatic links
483	got swapped with emphasis tags.
484
485
486Extra 1.1.1 (28 Dec 2006)
487
*	Fixed a problem where whitespace at the end of the line of an atx-style
	header would cause trailing `#` to appear as part of the header's content.
	This was caused by a small error in the regex that handles the definition
	for the id attribute in PHP Markdown Extra.

*	Fixed a problem where an empty abbreviation definition would eat the 
	following line as its definition.

*	Fixed an issue with calling the Markdown parser repeatedly with text 
	containing footnotes. The footnote hashes were not reinitialized properly.
498
499
5001.0.1e (28 Dec 2006)
501
*	Added support for internationalized domain names for email addresses in 
	automatic links. Improved the speed at which email addresses are converted 
	to entities. Thanks to Milian Wolff for his optimisations.

*	Made the conversion of email addresses to entities in automatic links 
	deterministic. This means that a given email address will always be 
	encoded the same way.

*	PHP Markdown will now use its own function to calculate the length of a 
	UTF-8 string in `detab` when `mb_strlen` is not available instead of 
	giving a fatal error.
513
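One way to count UTF-8 characters without the mbstring module is to match each character as a lead byte plus its continuation bytes. The sketch below illustrates that idea; the function names are hypothetical and the pattern is an assumption for this example, not necessarily what "markdown.php" uses.

```php
<?php
// Hypothetical fallback: count characters by matching either a single
// byte in the range 0x00-0xBF, or a lead byte (0xC0-0xFF) followed by
// its continuation bytes. preg_match_all returns the number of matches.
function sketch_utf8_strlen_fallback($text) {
    return preg_match_all('/[\x00-\xBF]|[\xC0-\xFF][\x80-\xBF]*/', $text, $m);
}

// Hypothetical wrapper: prefer mb_strlen when the mbstring module exists.
function sketch_utf8_strlen($text) {
    if (function_exists('mb_strlen')) {
        return mb_strlen($text, 'UTF-8');
    }
    return sketch_utf8_strlen_fallback($text);
}

$word = "h\xC3\xA9llo";  // "héllo": the "é" takes two bytes in UTF-8
echo strlen($word), "\n";                       // 6 (bytes)
echo sketch_utf8_strlen_fallback($word), "\n";  // 5 (characters)
```

Counting characters rather than bytes matters in `detab` because tab stops are measured in columns of text.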
514
515Extra 1.1 (1 Dec 2006)
516
517*	Added a syntax for footnotes.
518
519*	Added an experimental syntax to define abbreviations.
520
521
5221.0.1d (1 Dec 2006)
523
524*   Fixed a bug where inline images always had an empty title attribute. The 
525	title attribute is now present only when explicitly defined.
526
*	Link reference definitions can now have an empty title. Previously, if the 
	title was defined but left empty, the link definition was ignored. This can 
	be useful if you want an empty title attribute in images to hide the 
	tooltip in Internet Explorer.
531
532*	Made `detab` aware of UTF-8 characters. UTF-8 multi-byte sequences are now 
533	correctly mapped to one character instead of the number of bytes.
534
*	Fixed a small bug with WordPress where WordPress' default filter `wpautop`
	was not properly deactivated on comment text, resulting in hard line breaks
	where Markdown does not prescribe them.

*	Added a `TextileRestricted` method to the Textile compatibility mode. There
	is no restriction however, as Markdown does not have a restricted mode at 
	this point. This should make PHP Markdown work again in the latest 
	versions of TextPattern.

*	Converted PHP Markdown to an object-oriented design.
545
546*	Changed span and block gamut methods so that they loop over a 
547	customizable list of methods. This makes subclassing the parser a more 
548	interesting option for creating syntax extensions.
549
*	Also added a "document" gamut loop which can be used to hook document-level 
	methods (such as stripping link definitions).

*	Changed all methods which were inserting HTML code so that they now return 
	a hashed representation of the code. New methods `hashSpan` and `hashBlock`
	are used to hash respectively span- and block-level generated content. This 
	has a couple of significant effects:
	
	1.	It prevents invalid nesting of Markdown-generated elements which 
		could occur with constructs like `*something [link*][1]`.
	2.	It prevents problems occurring with deeply nested lists on which 
		paragraphs were ill-formed.
	3.	It removes the need to call `hashHTMLBlocks` twice during the 
		block gamut.
	
	Hashes are turned back to HTML prior to output.
566
567*	Made the block-level HTML parser smarter using a specially-crafted regular 
568	expression capable of handling nested tags.
569
570*	Solved backtick issues in tag attributes by rewriting the HTML tokenizer to 
571	be aware of code spans. All these lines should work correctly now:
572	
573		<span attr='`ticks`'>bar</span>
574		<span attr='``double ticks``'>bar</span>
575		`<test a="` content of attribute `">`
576
*	Changed the parsing of HTML comments to match simply from `<!--` to `-->` 
	instead of using the more complicated SGML-style rule with paired `--`.
	This is how most browsers parse comments and how XML defines them too.
580
581*	`<address>` has been added to the list of block-level elements and is now
582	treated as an HTML block instead of being wrapped within paragraph tags.
583
584*	Now only trim trailing newlines from code blocks, instead of trimming
585	all trailing whitespace characters.
586
587*	Fixed bug where this:
588
589		[text](http://m.com "title" )
590		
591	wasn't working as expected, because the parser wasn't allowing for spaces
592	before the closing paren.
593
594*	Filthy hack to support markdown='1' in div tags.
595
596*	_DoAutoLinks() now supports the 'dict://' URL scheme.
597
*	PHP- and ASP-style processing instructions are now protected as
	raw HTML blocks.
600
601		<? ... ?>
602		<% ... %>
603
604*	Fix for escaped backticks still triggering code spans:
605
606		There are two raw backticks here: \` and here: \`, not a code span
607
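The hashing idea behind `hashSpan` and `hashBlock` in the 1.0.1d notes can be illustrated with a self-contained sketch. The helper names, the placeholder format, and the simplified link and emphasis steps are all assumptions made for this example; the real parser is considerably more involved.

```php
<?php
// Hypothetical helper: store generated HTML and return an opaque
// placeholder that contains no Markdown syntax characters.
function sketch_hash($html, &$hashes) {
    $key = "\x1A" . count($hashes) . "\x1A";
    $hashes[$key] = $html;
    return $key;
}

// Hypothetical helper: put the stored HTML back. Later placeholders are
// restored first so that any placeholder nested inside them is restored too.
function sketch_unhash($text, $hashes) {
    $hashes = array_reverse($hashes, true);
    return str_replace(array_keys($hashes), array_values($hashes), $text);
}

$hashes = array();
$text   = '*something [link*][1]';

// Step 1: the link is converted first and its HTML is hidden behind a
// placeholder.
$link = sketch_hash('<a href="http://example.com/">link*</a>', $hashes);
$text = str_replace('[link*][1]', $link, $text);

// Step 2: emphasis. The second asterisk now lives in the stored HTML,
// so no pair of asterisks is found and nothing is wrapped.
$text = preg_replace('/\*(.+?)\*/', '<em>$1</em>', $text);

// Step 3: restore the HTML just before output.
echo sketch_unhash($text, $hashes), "\n";
// *something <a href="http://example.com/">link*</a>
```

Without the placeholder, the emphasis step would pair the two asterisks and produce `<em>` and `<a>` elements that overlap.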
608
609Extra 1.0 - 5 September 2005
610
611*   Added support for setting the id attributes for headers like this:
612	
613        Header 1            {#header1}
614        ========
615	
616        ## Header 2 ##      {#header2}
617	
    This works only for headers for now.
619
620*   Tables will now work correctly as the first element of a definition 
621    list. For example, this input:
622
623        Term
624
625        :   Header  | Header
626            ------- | -------
627            Cell    | Cell
628		    
629    used to produce no definition list and a table where the first 
630    header was named ": Header". This is now fixed.
631
632*   Fix for a problem where a paragraph following a table was not 
633    placed between `<p>` tags.
634
635
636Extra 1.0b4 - 1 August 2005
637
*   Fixed some issues where whitespace around HTML blocks was triggering
    empty paragraph tags.
640
641*   Fixed an HTML block parsing issue that would cause a block element 
642    following a code span or block with unmatched opening bracket to be
643    placed inside a paragraph.
644
645*   Removed some PHP notices that could appear when parsing definition
646    lists and tables with PHP notice reporting flag set.
647
648
649Extra 1.0b3 - 29 July 2005
650
651*   Definition lists now require a blank line before each term. Solves
652    an ambiguity where the last line of lazy-indented definitions could 
653    be mistaken by PHP Markdown as a new term in the list.
654
655*   Definition lists now support multiple terms per definition.
656
657*   Some special tags were replaced in the output by their md5 hash 
658    key. Things such as this now work as expected:
659	
660        ## Header <?php echo $number ?> ##
661
662
663Extra 1.0b2 - 26 July 2005
664
665*   Definition lists can now take two or more definitions for one term.
666    This should have been the case before, but a bug prevented this 
667    from working right.
668
*   Fixed a problem where single-column tables with a pipe only at the
    end were not parsed as tables. Here is such a table:
671	
672        | header
673        | ------
674        | cell
675
676*   Fixed problems with empty cells in the first column of a table with 
677    no leading pipe, like this one:
678	
679        header | header
680        ------ | ------
681               | cell
682
*   Code spans containing pipes did not work within a table. This is now 
    fixed by parsing code spans before splitting rows into cells.

*   Added the pipe character to the backslash escape character lists.
687
688Extra 1.0b1 (25 Jun 2005)
689
690*   First public release of PHP Markdown Extra.
691
692
693Copyright and License
694---------------------
695
696Copyright (c) 2004-2005 Michel Fortin  
697<http://www.michelf.com/>  
698All rights reserved.
699
700Based on Markdown  
701Copyright (c) 2003-2005 John Gruber   
702<http://daringfireball.net/>   
703All rights reserved.
704
705Redistribution and use in source and binary forms, with or without
706modification, are permitted provided that the following conditions are
707met:
708
709*   Redistributions of source code must retain the above copyright 
710    notice, this list of conditions and the following disclaimer.
711
712*   Redistributions in binary form must reproduce the above copyright
713    notice, this list of conditions and the following disclaimer in the
714    documentation and/or other materials provided with the 
715    distribution.
716
717*   Neither the name "Markdown" nor the names of its contributors may
718    be used to endorse or promote products derived from this software
719    without specific prior written permission.
720
721This software is provided by the copyright holders and contributors "as
722is" and any express or implied warranties, including, but not limited
723to, the implied warranties of merchantability and fitness for a
724particular purpose are disclaimed. In no event shall the copyright owner
725or contributors be liable for any direct, indirect, incidental, special,
726exemplary, or consequential damages (including, but not limited to,
727procurement of substitute goods or services; loss of use, data, or
728profits; or business interruption) however caused and on any theory of
729liability, whether in contract, strict liability, or tort (including
730negligence or otherwise) arising in any way out of the use of this
731software, even if advised of the possibility of such damage.
732