Converting HTML Entities Into Character Code Equivalents
7 Sep 2007
I'm working on a job for a client where legacy database data are being used to generate an XML document for processing with an XSLT stylesheet.
The data are stored in the database with HTML entities already encoded. So when I loaded the XML into my DOMDocument, I got the following warnings:
Warning: DOMDocument::loadXML() [function.DOMDocument-loadXML]: Entity 'middot' not defined in Entity, line: 963 in /usr/local/www/data-dist/sheds/includes/SDEHSFunctions.php on line 414
Instead of passing '&middot;' in the XML string given to DOMDocument::loadXML(), I needed either to declare all the entities in the XML doctype (bothersome) or to convert these named entities into numeric ones (e.g. '&middot;' becomes '&#183;').
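For comparison, the doctype route looks roughly like this. It is only a sketch: the <doc> element and the sample text are made up for illustration, and a real document would need one <!ENTITY> line for every named entity that can turn up in the data (which is what makes it bothersome).

```php
<?php
// Sketch: declare the entity in an internal DTD subset so libxml can resolve it.
// The <doc> element and the sample text are hypothetical.
$xml = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE doc [
  <!ENTITY middot "&#183;">
]>
<doc>Fish &middot; Chips</doc>
XML;

$dom = new DOMDocument();
// LIBXML_NOENT asks libxml to substitute entity references with their values.
$dom->loadXML($xml, LIBXML_NOENT);

// With the entity declared, loadXML() raises no "Entity not defined" warning.
$text = $dom->documentElement->textContent;
echo $text, "\n";
```

LIBXML_NOENT is only reasonable here because the XML comes from my own database; it should not be switched on for untrusted input.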
I took a look around and found this handy function:
http://php.net/get_html_translation_table
I did a print_r on the translation table it returns and found that it is an array where each key is the actual character and each value is the corresponding named HTML entity. So here's a quick function to get the numeric character reference equivalent:
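The listing from the original post is not available here, so what follows is a minimal sketch of the same idea rather than the original function. It assumes PHP 7.2 or later with the mbstring extension (for mb_ord) and UTF-8 data; the function name named_entities_to_numeric is made up for this example.

```php
<?php
/**
 * Replace named HTML entities (e.g. &middot;) with numeric character
 * references (e.g. &#183;) so that an XML parser accepts them.
 * Hypothetical helper name; assumes UTF-8 and the mbstring extension.
 */
function named_entities_to_numeric(string $input): string
{
    // Keys are the literal characters, values are the named entities.
    $table = get_html_translation_table(HTML_ENTITIES, ENT_QUOTES | ENT_HTML401, 'UTF-8');

    // XML predefines these five entities, so they can stay as they are.
    $xmlPredefined = ['&amp;', '&lt;', '&gt;', '&quot;', '&apos;'];

    $map = [];
    foreach ($table as $char => $entity) {
        if (in_array($entity, $xmlPredefined, true)) {
            continue;
        }
        // mb_ord() returns the Unicode code point of the character.
        $map[$entity] = '&#' . mb_ord((string) $char, 'UTF-8') . ';';
    }

    // strtr() with an array tries the longest keys first and never
    // rescans text it has already replaced.
    return strtr($input, $map);
}

echo named_entities_to_numeric('Fish &middot; Chips &amp; Peas'), "\n";
// prints: Fish &#183; Chips &amp; Peas
```

Run the XML string through this before handing it to DOMDocument::loadXML(). Entities outside the HTML 4.01 set are left alone and would still trigger the warning.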
This entry was posted on Friday, September 7th, 2007 at 7:01 pm. Author: Iain Dooley. Tags: php, recipe, recipes, xml, html, xslt
Building software in the real world - the Working Software blog