Against Boredom: sql

Showing posts with label sql. Show all posts

2018-04-19

Java 8: JDBC ResultSet to Stream

Streams API is a gift.
But JDBC is still the old-school one.
Let's wrap it!

package org.example.jdbc.stream;

import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Consumer;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public final class StreamHelper {
    public static class Record {
        private final Map<String, Object> fields = new HashMap<>(16);
        private final long count;

        private Record(final ResultSet resultSet) throws SQLException {
            final ResultSetMetaData metaData = resultSet.getMetaData();
            count = metaData.getColumnCount();
            for (int i = 1; i <= count; ++i) {
                fields.put(metaData.getColumnName(i), resultSet.getObject(i));
            }
        }

        /**
         * Is there a column named like this?
         *
         * @param columnName is the column name in the query.
         * @return True if found.
         */
        public boolean contains(final String columnName) {
            return fields.containsKey(columnName);
        }

        /**
         * Number of columns.
         *
         * @return Numer of columns.
         */
        public long count() {
            return count;
        }

        /**
         * Get value casted to the requested type.
         * <p>
         * No type checking happens inside. It is your job to know the datatype in the database.
         * <p>
         * Example:<br>
         * {@code record.get("COLUMN1", Long.class); // returns a Long}
         *
         * @param columnName is the column name in the query.
         * @param type is Java type of the column.
         * @return The value casted to the Java type.
         */
        public <T> T get(final String columnName, final Class<T> type) {
            return type.cast(getObject(columnName));
        }

        /**
         * Get columns in the record.
         *
         * @return Collection of the column names.
         */
        public Set<String> getColumns() {
            return Collections.unmodifiableSet(fields.keySet());
        }

        /**
         * Get value as an object.
         *
         * @param columnName is the column name in the query.
         * @return The value.
         */
        public Object getObject(final String columnName) {
            return fields.get(columnName);
        }

        /**
         * Get value as string.
         *
         * @param columnName is the column name in the query.
         * @return Value as string.
         */
        public String getString(final String columnName) {
            return Objects.toString(fields.get(columnName));
        }

        /**
         * Is the given cell null?
         *
         * @param columnName is the column name in the query.
         * @return True if null.
         */
        public boolean isNull(final String columnName) {
            return getObject(columnName) == null;
        }

        @Override
        public String toString() {
            return fields.entrySet().stream().map(e -> e.getKey() + ": " + e.getValue())
                    .collect(Collectors.joining(", "));
        }
    }

    /**
     * Wrap a ResultSet in a Stream.
     * <p>
     * The wrapper consumes the result set. The caller must close the result set after the stream
     * processing was finished.
     *
     * @param resultSet is the open result set to streamline.
     * @return A stream of rows.
     */
    public static Stream<Record> asStream(final ResultSet resultSet) {
        // "est = Long.MAX_VALUE if infinite, unknown, or too expensive to compute."
        return StreamSupport.stream(new Spliterators.AbstractSpliterator<Record>(Long.MAX_VALUE,
                Spliterator.NONNULL | Spliterator.IMMUTABLE) {
            @Override
            public boolean tryAdvance(final Consumer<? super Record> action) {
                try {
                    if (!resultSet.next()) {
                        return false;
                    }
                } catch (@SuppressWarnings("unused") final SQLException e) {
                    return false;
                }
                try {
                    action.accept(new Record(resultSet));
                } catch (@SuppressWarnings("unused") final SQLException e) {
                    return false;
                }
                return true;
            }
        }, true).parallel();
    }

    private StreamHelper() { /* Hidden. */ }
}

Usage example:

import org.example.jdbc.stream.StreamHelper;
import org.example.jdbc.stream.StreamHelper.Record;
// ...
    @GET
    @Path("test")
    public Response test() {
        try (Connection connection = Database.connect();
                Statement statement = connection.createStatement();
                ResultSet resultSet = statement.executeQuery("SELECT * FROM MY_TABLE")) {
            return Response.ok(StreamHelper.asStream(resultSet).map(Record::toString)
                    .collect(Collectors.joining(", "))).build();
        } catch (final SQLException | NamingException e) {
            return Response.serverError().entity(e.toString()).build();
        }
    }

2018-04-04

Oracle: Safe, Inline, is_numeric Check

So there was a character column, but I wanted to get the rows where it is not actually a number.

TO_NUMBER(REGEXP_SUBSTR(MY_COLUMN, '^\d+')) IS NULL

If the input is empty, it will return null. When will the input be empty? If the cell is a not number-only string.

2015-04-23

JIRA: Additional indexes for performance

Some indices for purrformance, that piled up during the years:

-- Fix Kepler idiocies:
ALTER INDEX PK_KISSUESTATE RENAME TO ISSUE_STATE_IDX;
ALTER INDEX PK_KPLUGINSCFG RENAME TO PLUGIN_CFG_IDX;
ALTER INDEX PK_KSTATEVALUES RENAME TO STATE_VALUES_IDX;

-- Standard additions
CREATE INDEX IX_NOTIF_INST_MSGID ON NOTIFICATIONINSTANCE (MESSAGEID ASC);
CREATE INDEX IX_CHANGEGROUP_AUTHOR ON CHANGEGROUP (AUTHOR ASC); -- can be big
CREATE INDEX IX_CHANGEGROUP_AUTHOR_CREATED ON CHANGEGROUP (AUTHOR ASC, CREATED ASC); -- can be big
CREATE INDEX IX_CHANGEGROUP_AUTH_CREA_ISSUE ON CHANGEGROUP (AUTHOR ASC, CREATED DESC, ISSUEID ASC);
CREATE INDEX IX_FILEATTACHMENT_FILENAME ON FILEATTACHMENT (FILENAME ASC, '1'); -- See Note
CREATE INDEX IX_JIRAACTION_AUTHOR ON JIRAACTION (AUTHOR ASC);
CREATE INDEX IX_JIRAACTION_AUTHOR_CREATED ON JIRAACTION (AUTHOR ASC, CREATED ASC); 
CREATE INDEX IX_JIRAACTION_AUTH_CREAT_ISSUE ON JIRAACTION (AUTHOR ASC, CREATED DESC, ISSUEID ASC);
CREATE INDEX IX_JIRAACTION_TYPE_LEVEL ON JIRAACTION (ID ASC, ACTIONTYPE ASC, ACTIONLEVEL ASC); 
CREATE INDEX IX_JIRAISSUE_ISSUESTATUS ON JIRAISSUE (ISSUESTATUS ASC, '1');
CREATE INDEX IX_JIRAISSUE_PROJ_ASSIGNEE ON JIRAISSUE (PROJECT ASC, ASSIGNEE ASC);
CREATE INDEX IX_JIRAISSUE_PROJECT ON JIRAISSUE (PROJECT ASC, '1');
CREATE INDEX IX_JIRAISSUE_PROJECT_UPDATED ON JIRAISSUE (PROJECT ASC, UPDATED ASC);
CREATE INDEX issue_reporter ON JIRAISSUE (REPORTER ASC); -- lately introduced by Atlassian in JIRA 6.3
CREATE INDEX IX_JIRAISSUE_SUMMARY ON JIRAISSUE (SUMMARY ASC, '1'); -- See Note
CREATE INDEX IX_JIRAISSUE_UPDATED_PROJ_ID ON JIRAISSUE (UPDATED DESC, PROJECT ASC, ID ASC); -- for idiotic pagination queries
CREATE INDEX IX_JIRAISSUE_WF_PKEY_1 ON JIRAISSUE (WORKFLOW_ID ASC, PKEY ASC, '1');
CREATE INDEX IX_OS_WFENTRY_ID_NAME ON OS_WFENTRY (ID ASC, NAME ASC);

-- customfieldvalue is large. we need it to be created quickly, then turn off parallelism.
CREATE INDEX IX_CUSTOMFIELDVALUE_ISSUE ON CUSTOMFIELDVALUE (ISSUE ASC) PARALLEL 12; ALTER INDEX IX_CUSTOMFIELDVALUE_ISSUE NOPARALLEL; -- can be _really_ big depending on your CF usage

-- bitmap indexes:
CREATE BITMAP INDEX IX_OS_WFENTRY_STATE ON OS_WFENTRY (STATE ASC); -- degree of cardinality is typically below 1 ppm.

-- recreate these indexes:
DROP INDEX PRMSSN_SCHEME; CREATE BITMAP INDEX PRMSSN_SCHEME ON SCHEMEPERMISSIONS(SCHEME ASC);
DROP INDEX WF_ENTRYID; CREATE INDEX WF_ENTRYID ON OS_CURRENTSTEP (ENTRY_ID, STEP_ID);
DROP INDEX ISSUE_WORKFLOW; CREATE INDEX ISSUE_WORKFLOW ON JIRAISSUE (WORKFLOW_ID, ISSUESTATUS);

Note: On multi column indices containing a '1': The '1' here is actually a static char(1) column, always containing '1', and even if the first (the real) column's value is null, it will be stored in the index, and thus will be searchable.

JIRA: When were the watchers added to an issue?

SELECT
   SOURCE_NAME AS WATCHER,
   A.CREATED             ,
   TO_CHAR(A.CREATED, 'yyyy-mm-dd hh24:mi:ss') FANCY_CREATED
FROM
   USERASSOCIATION A
INNER JOIN JIRAISSUE I
ON
   I.ID = A.SINK_NODE_ID
INNER JOIN PROJECT P
ON
   P.ID = I.PROJECT
WHERE
   P.PKEY     = 'TICKT' AND
   I.ISSUENUM = '1114';

2014-06-10

Oracle: Get approximate size of storage used by indexes by a user

SELECT
  SEGMENT_NAME,
  ROUND(BYTES / 1048576)       AS MEGABYTES,
  ROUND(BYTES / 1073741824, 1) AS GIGABYTES
FROM DBA_SEGMENTS
WHERE SEGMENT_TYPE = 'INDEX' AND OWNER = '_YOURUSER_COMES_HERE_'
ORDER BY 2 DESC, 1 ASC;

2014-04-02

Oracle: Get UNIX epoch

SELECT
  ROUND((SYSDATE - TO_DATE('01-01-1970 00:00:00', 'DD-MM-YYYY HH24:MI:SS')) * 24 * 60 * 60) AS EPOCH
FROM DUAL;

The subtraction tells how many days is the difference, as a floating point number. Multiply it with 24 hours * 60 minutes * 60 seconds to scale it to seconds. Round it as we don't want sub-second precision.

If the time zones don't match up, use UTC:

SELECT
  ROUND((CAST(SYS_EXTRACT_UTC(SYSTIMESTAMP) AS DATE) - TO_DATE('01-01-1970 00:00:00', 'DD-MM-YYYY HH24:MI:SS')) * 24 * 60 * 60) AS EPOCH
FROM DUAL;

2014-02-24

JIRA: Get workflows where the post-function refers to a plugin

Note, that JIRAWORKFLOWS.DESCRIPTOR is a valid XML, stored in a CLOB.

WITH W_JIRAWORKFLOWS AS
  ( SELECT WORKFLOWNAME,
      XMLTYPE.CREATEXML( REPLACE(DESCRIPTOR,
                '<!DOCTYPE workflow PUBLIC "-//OpenSymphony Group//DTD OSWorkflow 2.8//EN" "http://www.opensymphony.com/osworkflow/workflow_2_8.dtd">'
        )) AS DESCRIPTOR_XML
    FROM JIRAWORKFLOWS)
SELECT
  WS.NAME      AS WF_SCHEME_NAME,
  WSE.WORKFLOW AS WF_ENTITY_WORKFLOW,
  IT.PNAME     AS ISSUETYPE_NAME,
  XMLQUERY(
      'string-join(distinct-values(//post-functions/function/arg[@name="class.name" and not(starts-with(., "com.atlassian.jira.workflow"))]/text()), ",")'
      PASSING DESCRIPTOR_XML
      RETURNING CONTENT
    ).getStringVal() AS PLUGIN_CLASSNAMES
FROM WORKFLOWSCHEMEENTITY WSE
LEFT JOIN ISSUETYPE IT
ON IT.ID = WSE.ISSUETYPE
LEFT JOIN WORKFLOWSCHEME WS
ON WS.ID = WSE.SCHEME
INNER JOIN W_JIRAWORKFLOWS W ON W.WORKFLOWNAME = WSE.WORKFLOW
WHERE XMLEXISTS('//post-functions[function/arg[@name="class.name" and not(starts-with(., "com.atlassian.jira.workflow"))]]'
      PASSING BY VALUE DESCRIPTOR_XML);

This way you can see, what kind of Workflows you have, that has post-functions bound to plugins.

As you see, it is using Oracle XDB features, so you might need some rework when using another RDBMS. Also, the cutting out of the DTD is neccessary, as by default, XDB does not allow to load XML-s that have external references (here, the DTD has one).

JIRA: List Workflow Schemes, their Workflows, and associated Issue Types

SELECT
  WS.NAME      AS WF_SCHEME_NAME,
  WSE.WORKFLOW AS WF_ENTITY_WORKFLOW,
  IT.PNAME     AS ISSUETYPE_NAME
FROM
  WORKFLOWSCHEMEENTITY WSE
LEFT JOIN ISSUETYPE IT
ON
  IT.ID = WSE.ISSUETYPE
LEFT JOIN WORKFLOWSCHEME WS
ON
  WS.ID = WSE.SCHEME;

2013-09-10

Confluence: Search for things skipping the shipped "search engine"

Using Atlassian Confluence, and have to report on exact things, like "How many, and which pages refer to a given phrase"?
This searches through all the latest version of all the pages:

SELECT DISTINCT
   'http://confluence.example.com/confluence/pages/viewpage.action?pageId=' || contentid AS URL
FROM
   CONTENT
INNER JOIN BODYCONTENT USING(contentid)
WHERE
   prevver      IS NULL
   AND SPACEID  IS NOT NULL
   AND PARENTID IS NOT NULL
   AND REGEXP_LIKE(BODY, 'internal-webserver\d(\.example\.com)?\\public', 'i');

HP BSM: Using Oracle Services instead of SID (useful if having a RAC)

This cannot be set on the wizard, this has to be done by hand. Origin of this information

Log into the server. (rdp or other)
Stop BSM.

Create the file D:\HPBSM\conf\bsm-tnsnames.ora with content:

hpbsm_service = 
(DESCRIPTION = 
    (ADDRESS_LIST = 
        (ADDRESS = 
            (PROTOCOL = TCP)
            (HOST = oraclerac-vip.example.com)
            (PORT = 1521)
        )
        (LOAD_BALANCE = on)
        (FAILOVER = on)
    )
    (CONNECT_DATA = 
        (SERVER = DEDICATED)
        (SERVICE_NAME = hpbsm_service)
    )
)

Edit D:\HPBSM\conf\jdbc.drivers.properties
1. Comment out the line starting with: ddoracle.url=
2. Add new line (literally): ddoracle.url=jdbc:mercury:oracle:TNSNamesFile=D:\\HPBSM\\conf\\bsm-tnsnames.ora;TNSServerName=${sid}

Create the file D:\HPBSM\odb\conf\jdbc.properties:

Oracle = ddoracle
cmdb.url = jdbc:mercury:oracle:TNSNamesFile=D:\\HPBSM\\conf\\bsm-tnsnames.ora;TNSServerName=hpbsm_service

(here you have to include the service name from the bsm-tnsnames.ora)

Run the "Configure HP Business Service Management" utility, and enter hpbsm_service in the SID field (it will be converted to a tnsnames.ora entry lookup by our hack above).

Connect to hp_mgr with SQL Developer or TOAD (or whatever you like):

SELECT SESSION_ID, SESSION_DB_SID FROM sessions WHERE SESSION_DB_NAME = 'profile_mgr'; -- Must be hpbsm_service. If not, update.
SELECT SP_ID, SP_VALUE FROM setting_parameters WHERE SP_NAME IN ('opr.db.connection.sid', 'dal.bpi.runtime.sid', 'dal.bpi.pr.repos.sid');  -- All must be hpbsm_service. If not, update.

UPDATE SESSIONS SET SESSION_DB_SID = 'hpbsm_service' WHERE SESSION_ID IN (SELECT SESSION_ID FROM SESSIONS WHERE SESSION_DB_NAME = 'profile_mgr');
UPDATE SETTING_PARAMETERS SET SP_VALUE = 'hpbsm_service' WHERE SP_ID IN (SELECT SP_ID FROM SETTING_PARAMETERS WHERE SP_NAME IN ('opr.db.connection.sid', 'dal.bpi.runtime.sid', 'dal.bpi.pr.repos.sid'));

Start BSM.
VALIDATE

HP BSM: Removing licenses

As you already know, you cannot remove licenses through the BUI. The only way there is to remove it from the database directly!
Log in to the DB, into the hp_mgr schema, and:

SELECT license_key FROM licenses;

The rows ending with a "T*" identifier are the actual licenses. Select the license you want to remove, and delete that row (note: the license_key is the key candidate here), and restart BSM, to get the change in effect!

Useful, if you have mistakenly added a prod license to your dev tier, or vice versa.

Jira: Get latest activity for all users after a given date

This query lists active users after a given date. Useful for filtering out inactive users.
In this example, list users who were active in this year (2013):

WITH w_all_user_activities AS
  (SELECT 
    author,
    g.created
  FROM jiraissue i
  INNER JOIN changegroup g ON i.id = g.issueid
  INNER JOIN changeitem c ON c.groupid = g.id
  UNION
  SELECT author, created FROM jiraaction
  UNION
  SELECT reporter, created FROM jiraissue
),
w_tmp_latest_user_activities AS
  (SELECT
    author,
    created,
    RANK() OVER (PARTITION BY author ORDER BY created DESC) dest_rank
  FROM w_all_user_activities
),
w_latest_user_activities AS
  (SELECT 
    author,
    created
  FROM w_tmp_latest_user_activities
  WHERE dest_rank = 1
)
SELECT * FROM w_latest_user_activities 
WHERE created >= TO_DATE('2013-01-01', 'yyyy-mm-dd');

Jira: SQL for "Check workflow entry states are correct" integrity check

If this query returns any rows, run the "Check workflow entry states are correct" action in the Integrity Checker.
This issue happens as a result of the lack of using database transactions during the workflow transitions (which changes numerous tables) in Jira.

SELECT jiraissue.pkey, jiraissue.workflow_id, OS_WFENTRY.*
FROM jiraissue
INNER JOIN OS_WFENTRY ON jiraissue.workflow_id = OS_WFENTRY.id
WHERE OS_WFENTRY.STATE IS NULL OR OS_WFENTRY.STATE = 0;

Update: For JIRA version 6.x I suggest reading this: http://kozelljozsef.blogspot.com/2014/11/jira-check-workflow-entry-states-are-correct.html

Jira: List users assigned explicitly to project roles

Assigning people to project roles explicitly is a bad habit, you should avoid it, as it will lay much work on your shoulders when a user's privileges will be changed.

SELECT actor.roletypeparameter AS username, r.name AS rolename, p.pkey, p.pname AS projectname
FROM projectroleactor actor
INNER JOIN cwd_user u ON u.lower_user_name = actor.roletypeparameter
INNER JOIN projectrole r ON actor.projectroleid = r.ID
INNER JOIN project p ON actor.pid = p.ID;

Jira: Activity Stream stopped working, throws error

Using Atlassian Jira, and having faulty activity streams? Also, using Oracle?
You have an issue (usually from an email), where the subject is empty.
In Oracle, the empty string is stored as null.
Therefore, when Jira reads back an issue, its subject will be null (not empty string).
This screws up the Activity Stream gadget, as it calls String.substring() on it, without any error checking.

Get rid of the null subjects.

UPDATE JIRAISSUE SET SUMMARY = '(no subject)' WHERE SUMMARY IS NULL;

Jira: "Passed List had more than one value."

This error happens in Atlassian Jira, as they maliciously and intentionally avoid using foreign keys and unique indices in their table design.
If you see the following error in your tomcat log, you have to do a cleanup:

2013-01-01 01:01:01,111 ajp-8009-1 ERROR      [500ErrorPage.jsp] Exception caught in 500 page org.apache.jasper.JasperException: java.lang.RuntimeException: java.lang.IllegalArgumentException: Passed List had more than one value. 
org.apache.jasper.JasperException: org.apache.jasper.JasperException: java.lang.RuntimeException: java.lang.IllegalArgumentException: Passed List had more than one value. 
(...)
Caused by: java.lang.IllegalArgumentException: Passed List had more than one value. 
        at org.ofbiz.core.entity.EntityUtil.getOnly(EntityUtil.java:62) 
(...)

This actually deletes all the duplicated rows, to make every row to be unique (over 'username', at least):

DELETE FROM columnlayout
WHERE ROWID IN (
      SELECT "ROWID" FROM (
            SELECT RANK() OVER(PARTITION BY USERNAME ORDER BY ROWID) RANK_N, ROWID AS "ROWID" FROM columnlayout
            WHERE USERNAME IN (
                  SELECT USERNAME FROM columnlayout
                  GROUP BY USERNAME
                  HAVING COUNT(*) > 1
            )
      )
      WHERE RANK_N > 1
);

Jira: deleted user coming from external directory (like LDAP)

You are using Atlassian Jira, but your users are authenticated from LDAP (AD/whatever)?
Surely, when one of your users leaves or gets her name changed, you'll be stuck with malfunctioning watcher lists, as Jira does not use any of those pesky "relational" stuff in RDBMS-s (valid of any DBMS vendor).
Get rid of the faulty associations.

DELETE FROM userassociation 
WHERE LOWER(source_name) IN (
    SELECT DISTINCT LOWER(source_name) FROM userassociation
    MINUS
    SELECT DISTINCT lower_user_name FROM cwd_user)

Jira: "No scrollbar" problem

Using Atlassian Jira, and an issue view page does not have scrollbars but it should?
You've got an email containing "overflow: hidden". Get rid of it.

This will list you all the issues where the description contains the offending CSS.

SELECT * FROM jiraissue
WHERE REGEXP_LIKE(description, 'overflow: *hidden', 'i');

2013-08-29

SQL: why you cannot use regex to validate a select

Regular expressions can match languages only a finite state automaton can parse, which is very limited, whereas SQL is a syntax. It can be demonstrated you can't validate SQL with a regex. So, you can stop trying.
SQL is a type-2 grammar, it is too powerful to be described by regular expressions. It's the same as if you decided to generate C# code and then validate it without invoking a compiler. Database engine in general is too complex to be easily stubbed.

2013-08-07

Oracle: delete duplicate rows from a table, to have only one row for each unique expression

DELETE FROM EXTERNAL_ENTITIES
WHERE
   ROWID IN
   (
      SELECT "ROWID" -- ESCAPED, BECAUSE WE NEED THE INNER ROWID, NOT THIS TMPTABLE'S ROWID
      FROM
         (
            SELECT
               RANK() OVER(PARTITION BY NAME ORDER BY ROWID) RANK_N, -- ORDER IN THE SUBGROUP
               ROWID AS "ROWID"                                    , -- ESCAPE THE ROWID
            FROM EXTERNAL_ENTITIES
            WHERE
               NAME IN
               (
                  SELECT NAME
                  FROM EXTERNAL_ENTITIES
                  GROUP BY NAME
                  HAVING COUNT(*) > 1 -- ONLY DUPLICATES OR MORE
               )
         )
      WHERE RANK_N > 1
   );

DELETE FROM CUSTOMFIELDVALUE
WHERE ROWID IN
   (SELECT T."ROWID" FROM
      (SELECT RANK() OVER(PARTITION BY ISSUE, CUSTOMFIELD ORDER BY ROWID) RANK_N, ROWID AS "ROWID"
         FROM CUSTOMFIELDVALUE
         WHERE CUSTOMFIELD IN (10180, 10216, 10336, 10340, 10464, 10550, 10590, 10911, 14714) -- customfield ID-s come here, taken from the log.
      ) T
   WHERE T.RANK_N > 1);

Subscribe to: Posts ( Atom )