Skip to main content
Feedback

HTML to JSON Policy

Transform the response content from HTML to JSON.

This policy is based on the jsoup HTML parser.

Timing

On RequestOn ResponseOn Request ContentOn Response Content
X

Configuration

PropertyRequiredDescriptionType
jsonNameyesName of the JSON field to contain the result of the selection.string
selectoryesHTML/CSS selector used to select an element and get the text.string
arraynoDetermine if the selection needs to be returned as an array.boolean

Example

HTML content:

<html>
<body>
<table id="tid">
<thead>
<tr>
<th>name</th>
<th>age</th>
<th>major</th>
<th>student_id</th>
<th>gpa</th>
</tr>
</thead>
<tbody id="student">
<tr>
<td>Alice Johnson</td>
<td>20</td>
<td>Computer-Science</td>
<td>S123456</td>
<td>4.8</td>
</tr>
<tr>
<td>Bob Smith</td>
<td>22</td>
<td>Literature</td>
<td>S654321</td>
<td>4.2</td>
</tr>
<tr>
<td>Charlie Brown</td>
<td>21</td>
<td>Computer-Science</td>
<td>S789012</td>
<td>4.5</td>
</tr>
</tbody>
</table>
</body>
</html>

html-json configuration:

{
"selectors": [
{
"jsonName": "StudentName",
"selector": "#student tr td:first-child",
"array": true
},
{
"jsonName": "StudentId",
"selector": "#student tr td:nth-child(4)",
"array": true
},
{
"selector": "table#tid td",
"jsonName": "TableData",
"array": true
},
{
"jsonName": "CompleteData",
"selector": "body > table",
"array": true
}
]
}

Expected output:

{
"CompleteData": [
"name age major student_id gpa Alice Johnson 20 Computer-Science S123456 4.8 Bob Smith 22 Literature S654321 4.2 Charlie Brown 21 Computer-Science S789012 4.5"
],
"TableData": [
"Alice Johnson",
"20",
"Computer-Science",
"S123456",
"4.8",
"Bob Smith",
"22",
"Literature",
"S654321",
"4.2",
"Charlie Brown",
"21",
"Computer-Science",
"S789012",
"4.5"
],
"StudentName": [
"Alice Johnson",
"Bob Smith",
"Charlie Brown"
],
"StudentId": [
"S123456",
"S654321",
"S789012"
]
}
On this Page