|
Using text driver to parse denormalized data |
John Johnson |
2017-11-03 11:49:37 |
I develop solutions for an RSA product that includes a version of the HXTT Text JDBC driver. I have a situation where I need to parse data that is packed into a field from a CSV file; the value is basically a CSV within a CSV. I see there is a split function referenced in the documentation, but I cannot find any useful examples for how it can be used within a SELECT statement. My ultimate goal is to produce normalized query results.
As a simple example, the input CSV might look like:
1,"a,b,c,d"
2,"e,f,g"
The output that I'm looking for is:
1 a
1 b
1 c
1 d
2 e
2 f
2 g
I know how to achieve the above with both Oracle and MS SQL, but it's not clear how to do this with the HXTT Text driver. Can you provide any suggestions?
|
Re:Using text driver to parse denormalized data |
HXTT Support |
2017-11-06 05:40:47 |
What SQL do you use for this in Oracle or MS SQL?
|
Re:Re:Using text driver to parse denormalized data |
John Johnson |
2017-11-06 06:26:10 |
The solutions are very different for each database. For Oracle, this is the basic pattern:
SELECT REGEXP_SUBSTR('a,b,c,d', '[^,]+', 1, LEVEL) VAL FROM DUAL
CONNECT BY REGEXP_SUBSTR('a,b,c,d', '[^,]+', 1, LEVEL) IS NOT NULL
For MS SQL, I found this article that describes a few different approaches:
https://stackoverflow.com/questions/5722700/how-do-i-split-a-delimited-string-in-sql-server-without-creating-a-function
The lack of support for Common Table Expressions (CTEs) and XML in the HXTT Text driver seems to get in the way of every approach described in that thread.
The only thing I spotted in the documentation that might work is the SPLIT function that conceptually sounds like the STRING_SPLIT function in newer MS SQL Server versions. But I haven't been able to get any useful output.
|
Re:Re:Re:Using text driver to parse denormalized data |
HXTT Support |
2017-11-06 08:38:07 |
HXTT Text (CSV) supports CONNECT BY, CUBE, ROLLUP, and so on. I don't think CONNECT BY can do it in Oracle for such a table.
For CSV, you can use:
SELECT column1, wantValue FROM testvi, (VALUES 'a','b','c','d','e','f','g') AS matchValues(wantValue) WHERE INSTR(column2 + ',', wantValue + ',') > 0;
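One caveat with this workaround: the VALUES list has to enumerate every possible item in advance, and INSTR matches substrings, so an item such as 'b' would also match a field containing 'ab'. A minimal variant, assuming `+` concatenates strings as in the query above and untested against the driver, wraps both sides in delimiters so that only whole items match:

```sql
-- Sketch only: testvi, column1 and column2 are the names used in the query above.
-- Leading and trailing commas on both sides force whole-item matches,
-- so ',b,' is not found inside ',ab,c,'.
SELECT column1, wantValue
FROM testvi,
     (VALUES 'a','b','c','d','e','f','g') AS matchValues(wantValue)
WHERE INSTR(',' + column2 + ',', ',' + wantValue + ',') > 0;
```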
|
Re:Re:Re:Re:Using text driver to parse denormalized data |
HXTT Support |
2018-02-02 05:36:43 |
A new feature will be available in 48 hours.
Split Multivalue Column Into Rows
If a row contains one or more multivalue columns, you can join the table to a special subquery that calls split on those columns. For instance:
select User,Role from aTable,(select split(aTable.Roles,',') as Role) AS bTable;
select User,Role,Year from aTable,(select split(aTable.Roles,',') as Role,split(aTable.Years,',') as Year) AS bTable;
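Applied to the data in the original question, the announced syntax would presumably look like the sketch below. It assumes the CSV is exposed as table testvi with columns column1 and column2, as in the earlier reply, and that split behaves as described here; val and parts are illustrative names only.

```sql
-- Sketch only, not verified against the driver.
-- testvi.column2 holds the packed value, e.g. "a,b,c,d".
select column1, val from testvi, (select split(testvi.column2, ',') as val) AS parts;
```

If the feature works as announced, this should return one row per item, for example 1 a through 1 d for the first input line.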
|