The following common functions are supported in Luminesce SQL queries:
- Most of the built-in scalar SQL functions.
- The SQLite date and time functions.
- The mathematical and string functions exposed by the SQLite common extension library (C source file).
- SQL window functions.
- The functions exposed by the JSON1 extension library.
We have also implemented the following custom functions:
General custom functions
General custom function | Explanation and examples
THROW 
Can be used conditionally to throw an error message, for example:
select iif(abc is not null, abc, throw('abc cannot be null'))

PRINT 
Can be used to debug a query, give feedback to an end user running the query, or enter text into the logs. For example:
select print('*** Starting your process NOW!')
=> [INF] LuminesceCli: >> Progress >> PRINT >> *** Starting your process NOW!

select print('*** Starting your process NOW! {X:00000}', 'error', 'START!!!', 42)
=> [ERR] LuminesceCli: >> Progress >> START!!! >> *** Starting your process NOW! 00042

select print('*** Starting your process NOW! {X}', 'error', 'START!!!', 42)
=> [ERR] LuminesceCli: >> Progress >> START!!! >> *** Starting your process NOW! 42
DAY_COUNT_DISPLAY 
Converts a fractional day number to a more intuitive count of days, hours, minutes, seconds and so on.
select
  day_count_display(3.1234567) a,
  day_count_display(0.00123) b,
  day_count_display(0.00000123) c,
  day_count_display(julianday('now', '+2 minutes') - julianday('now', '-1 hours', '+3 minutes', '-12 seconds')) d

=>
┌───────────┬────────┬───────┬─────────┐
│ a         │ b      │ c     │ d       │
├───────────┼────────┼───────┼─────────┤
│ 3d 2h 57m │ 1m 46s │ 106ms │ 59m 12s │
└───────────┴────────┴───────┴─────────┘
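The display format can be sketched in Python. The grouping rule used here (at most three units, milliseconds only for sub-second values) is inferred from the examples above, not taken from the Luminesce implementation:

```python
def day_count_display(days: float) -> str:
    """Render a fractional day count as e.g. '3d 2h 57m'.

    Assumed rule: show the largest non-zero units down to whole seconds,
    at most three of them; values under one second are shown in ms.
    """
    total_ms = round(abs(days) * 86_400_000)  # fractional days -> milliseconds
    if total_ms < 1000:
        return f"{total_ms}ms"
    parts = []
    for name, ms in [("d", 86_400_000), ("h", 3_600_000), ("m", 60_000), ("s", 1_000)]:
        qty, total_ms = divmod(total_ms, ms)
        if qty:
            parts.append(f"{qty}{name}")
        if len(parts) == 3:
            break
    return " ".join(parts)

print(day_count_display(3.1234567))   # 3d 2h 57m
print(day_count_display(0.00123))     # 1m 46s
print(day_count_display(0.00000123))  # 106ms
```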
TO_DATE 
Performs an explicit date conversion from a string in a non-standard format, for example:
select to_date('2022/29/03', 'yyyy/dd/MM') as aDate

REGEXP 
Enables filtering by regular expression matches. Note this function is used as an operator and so behaves differently to the other regular expression functions below. For example:
select * from x where abc REGEXP '([A-Z]*[0-9]*)'

REGEXP_MATCH 
Returns the first portion of the search string that matches the regular expression, or null if there are no matches. The parameter syntax is as follows: (search-string, regex [, name-of-capture-group-to-return]). For example:
-- returns: ABC123
select regexp_match('XXX: ABC123 Something', '([A-Z]+[0-9]*)')
select regexp_match('XXX: ABC123 Something', '([a-zA-Z]+[0-9]*)')
-- returns: null (due to not matching on case)
select regexp_match('XXX: ABC123 Something', '([a-z]+[0-9]*)')
-- returns: ABC
select regexp_match('XXX: ABC123 Something', '(?<project>[A-Z]+)(?<id>[0-9]*)', 'project')
REGEXP_MATCHES 
Returns the portions of the search string that match the regular expression as a comma-separated list, or null if there are no matches. The parameter syntax is as follows: (search-string, regex). For example:
-- returns: ABC123,QQQ456
select regexp_matches('XXX: ABC123 something QQQ456 Something', '([A-Z]+[0-9]*)')
REGEXP_MATCH_LOCATION 
Returns the 1-based index of the first portion of the search string that matches the regular expression, or null if there are no matches. The parameter syntax is as follows: (search-string, regex [, name-of-capture-group-to-return]). For example:
-- returns: 6
select regexp_match_location('XXX: ABC123 Something', '([A-Z]+[0-9]*)')
select regexp_match_location('XXX: ABC123 Something', '([a-zA-Z]+[0-9]*)')
select regexp_match_location('XXX: ABC123 Something', '(?<project>[A-Z]+)(?<id>[0-9]*)')
-- returns: null (due to not matching on case)
select regexp_match_location('XXX: ABC123 Something', '([a-z]+[0-9]*)')
-- returns: 10
select regexp_match_location('XXX: ABC123 Something', '(?<project>[A-Z]+)(?<id>[0-9]*)', 'id')
REGEXP_REPLACE 
Replaces the portions of the search string that match the regular expression with the replacement value. The parameter syntax is as follows: (search-string, regex, replacement-value). For example:
-- returns: XXX: ? something ? Something
select regexp_replace('XXX: ABC123 something QQQ456 Something', '([A-Z]+[0-9]*)', '?')
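For prototyping a pattern outside Luminesce, Python's standard `re` module offers rough analogues of these functions. Note the caveat: `re` always returns the leftmost match, so `[0-9]+` (requiring at least one digit) is used below to make the first hit 'ABC123' as in the examples above; the Luminesce engine's exact matching semantics may differ:

```python
import re

s = "XXX: ABC123 something QQQ456 Something"

# Rough analogue of REGEXP_MATCH: first matching portion, or None.
m = re.search(r"([A-Z]+[0-9]+)", s)
print(m.group(1))                            # ABC123

# Rough analogue of REGEXP_MATCHES: all matches as a comma-separated list.
print(",".join(re.findall(r"[A-Z]+[0-9]+", s)))  # ABC123,QQQ456

# Rough analogue of REGEXP_MATCH_LOCATION: 1-based index of the first match.
print(m.start() + 1)                         # 6

# Rough analogue of REGEXP_REPLACE.
print(re.sub(r"[A-Z]+[0-9]+", "?", s))       # XXX: ? something ? Something

# Named capture groups, as in the (?<project>...)(?<id>...) examples
# (Python spells them (?P<name>...)).
m2 = re.search(r"(?P<project>[A-Z]+)(?P<id>[0-9]+)", s)
print(m2.group("project"), m2.group("id"))   # ABC 123
```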
Statistical custom functions
Explanation | Luminesce SQL syntax | Equivalent Lumipy method

The coefficient of variation is the standard deviation scaled by the mean. It is a standardised measure of the dispersion of a random variable, so distributions of different scale can be compared [1].

coefficient_of_variation([x]) 
x.stats.coef_of_variation() 
Covariance is a statistical measure of the joint variability of two random variables [1].

covariance([x], [y], ddof) 
x.stats.covariance(y, ddof) 
The empirical CDF is the cumulative distribution function of a sample. It is a step function that jumps by 1/n at each of the n data points. This function returns the value of the empirical CDF at the given value [1].

empirical_cume_dist_function([x], value) 
x.stats.empirical_cdf(value) 
The Shannon entropy measures the average amount of "surprise" in a sequence of values. It can be considered a measure of variability [1]. It is calculated as

S = -Σ_i p_i · ln(p_i)

where p_i is the probability of the ith value occurring, computed from the sample (n occurrences / sample size). This function is equivalent to scipy.stats.entropy called with a single series and with the natural base [2].

entropy([x]) 
x.stats.entropy() 
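A minimal Python sketch of the formula above, computing empirical probabilities from value counts and using the natural base:

```python
import math
from collections import Counter

def entropy(xs):
    """Shannon entropy (natural base) from empirical probabilities p_i = count / n."""
    n = len(xs)
    return -sum((c / n) * math.log(c / n) for c in Counter(xs).values())

print(entropy(["a", "b", "a", "b"]))  # two equally likely values -> ln(2)
```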
The geometric mean is the multiplicative equivalent of the normal arithmetic mean. It multiplies a set of n-many numbers together and then takes the nth root of the result [1].

geometric_mean([x]) 
x.stats.geometric_mean() 
The geometric standard deviation measures the variability of a set of numbers where the appropriate mean to use is the geometric one (they are more appropriately combined by multiplication rather than addition) [1]. It is computed as the exponential of the standard deviation of the natural log of each element in the set.

exp(window_stdev(log([x]))) 
x.stats.geometric_stdev() 
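Both quantities reduce to ordinary moments in log space, which is why the SQL expression above composes exp, window_stdev and log. A small Python sketch of the identities:

```python
import math

def geometric_mean(xs):
    # nth root of the product == exp(mean(log x))
    return math.exp(sum(math.log(x) for x in xs) / len(xs))

def geometric_stdev(xs, ddof=1):
    # exp of the sample standard deviation of the log values
    logs = [math.log(x) for x in xs]
    mu = sum(logs) / len(logs)
    var = sum((l - mu) ** 2 for l in logs) / (len(logs) - ddof)
    return math.exp(math.sqrt(var))

print(geometric_mean([2, 8]))  # sqrt(2 * 8) = 4.0
```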
The harmonic mean is the reciprocal of the mean of the individual reciprocals of the values in a set [1].

harmonic_mean([x]) 
x.stats.harmonic_mean() 
The interquantile range is the difference between two different quantiles. This is a generalisation of the interquartile range, where q1 = 0.25 and q2 = 0.75. The upper quantile (q2) value must be greater than the lower quantile (q1) value.

interquantile_range([x], [q1], [q2]) 
x.stats.interquantile_range() 
The interquartile range is the difference between the upper and lower quartiles. It can be used as a robust measure of the variability of a random variable [1].

interquartile_range([x]) 
x.stats.interquartile_range() 
Kurtosis measures how much probability density is in the tails (extremes) of a sample's distribution [1]. This function corresponds to the Pearson kurtosis measure and currently only supports sample kurtosis.

kurtosis([x]) 
x.stats.kurtosis() 
The lower quartile is the value that bounds the lower quarter of a dataset [1]. It is equivalent to quantile 0.25, or the 25th percentile.

quantile([x], 0.25) 
x.stats.lower_quartile() 
This is a convenience function for computing the mean divided by the standard deviation. It is used in multiple financial statistics, such as the Sharpe ratio and information ratio.

mean_stdev_ratio([x]) 
x.stats.mean_stdev_ratio() 
The median is the value that separates the top and bottom half of a dataset [1]. It is equivalent to quantile 0.5, or the 50th percentile.

quantile([x], 0.5) 
x.stats.median() 
The median absolute deviation is a measure of the variability of a random variable. Unlike the standard deviation, it is robust to the presence of outliers [1].

median_absolute_deviation([x]) 
x.stats.median_abs_deviation() 
Pearson's r is a measure of the linear correlation between two random variables [1].

pearson_correlation([x], [y]) 
x.stats.pearson_r(y) 
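For reference, Pearson's r is the covariance of the two series divided by the product of their standard deviations. A minimal Python sketch:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: covariance over the product of standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

print(pearson_r([1, 2, 3], [2, 4, 6]))  # perfectly linear -> 1.0
```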
The quantile function of a given random variable and q value finds the value x where the probability of observing a value less than or equal to x is equal to q [1].

quantile([x], q) 
x.stats.quantile(q) 
The root mean square (RMS) is the square root of the mean of the squared values of a set. It is a statistical measure of the spread of a random variable [1].

root_mean_square([x]) 
x.stats.root_mean_square() 
Skewness measures the degree of asymmetry of a random variable around its mean [1]. This calculation currently only supports sample skewness.

skewness([x]) 
x.stats.skewness() 
Spearman's rho measures how monotonic the relationship between two random variables is [1].

spearman_rank_correlation([x], [y]) 
x.stats.spearman_r(y) 
An implementation of standard deviation that can be used in a window. The standard deviation measures the dispersion of a set of values around the mean [1]. This only calculates the sample standard deviation (delta degrees of freedom = 1).

window_stdev([x]) 
x.stats.stdev() 
The upper quartile is the value that bounds the upper quarter of a dataset [1]. It is equivalent to quantile 0.75, or the 75th percentile.

quantile([x], 0.75) 
x.stats.upper_quartile() 
Computes the y-intercept (alpha) of a regression line fitted to the given data [1].

linear_regression_alpha([x], [y]) 
x.linreg.alpha(y) 
Computes the standard error of the y-intercept (alpha) of a regression line fitted to the given data [1]. This assumes the residuals are normally distributed and is calculated according to [2].

linear_regression_alpha_error([x], [y]) 
x.linreg.alpha_std_err(y) 
Computes the gradient (beta) of a regression line fitted to the given data [1].

linear_regression_beta([x], [y]) 
x.linreg.beta(y) 
Computes the standard error of the gradient (beta) of a regression line fitted to the given data [1]. This assumes the residuals are normally distributed and is calculated according to [2].

linear_regression_beta_error([x], [y]) 
x.linreg.beta_std_err(y) 
The Bray-Curtis distance is the element-wise sum of absolute differences between elements divided by the absolute value of their sum [1].

braycurtis_distance([x], [y]) 
x.metric.braycurtis_distance(y) 
The Canberra distance is the element-wise sum of absolute differences between elements divided by the sum of their absolute values. It can be considered a weighted version of the Manhattan distance [1].

canberra_distance([x], [y]) 
x.metric.canberra_distance(y) 
The Chebyshev distance is the greatest difference between the dimension values of two vectors. It is equivalent to the Minkowski distance as p → ∞ [1].

chebyshev_distance([x], [y]) 
x.metric.chebyshev_distance(y) 
The cosine distance is the cosine of the angle between two vectors subtracted from 1 [1].

cosine_distance([x], [y]) 
x.metric.cosine_distance(y) 
The Euclidean distance is the familiar 'as the crow flies' distance. It is the square root of the sum of squared differences between the elements of two vectors [1].

euclidean_distance([x], [y]) 
x.metric.euclidean_distance(y) 
The F-score is a classifier performance metric which measures accuracy. It is defined as the weighted harmonic mean of the precision and recall scores. The beta parameter controls the relative weighting of these two metrics. The most common value of beta is 1: this is the F_1 score (aka the balanced F-score), which weights precision and recall evenly. Values of beta greater than 1 weight recall higher than precision; values less than 1 weight precision higher than recall [1].

fbeta_score([x], [y], beta) 
x.metric.f_score(y, beta) 
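The weighted harmonic mean described above can be sketched in Python. The 0/1 label encoding is an assumption for illustration:

```python
def fbeta_score(actual, predicted, beta=1.0):
    """F-beta score from precision and recall, for 0/1 labels."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    # Weighted harmonic mean of precision and recall.
    return (1 + b2) * precision * recall / (b2 * precision + recall)
```

With beta=1 and precision = recall = 2/3, the score is also 2/3, as the harmonic mean of equal values is the value itself.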
The Manhattan distance (aka the taxicab distance) is the absolute sum of differences between the elements of two vectors. It is the distance traced out by a taxicab moving along a city grid, like Manhattan, where the distance travelled is the sum of the sides of the squares [1].

manhattan_distance([x], [y]) 
x.metric.manhattan_distance(y) 
The mean absolute error is the mean absolute element-wise difference between two series of values [1]. It is a common performance metric for regression models, where one series is the predicted values and the other series is the observed values.

mean_absolute_error([x], [y]) 
x.metric.mean_absolute_error(y) 
The mean fractional absolute error is the mean absolute element-wise fractional difference between two series of values. It is a scale-invariant version of the mean absolute error [1]. It is a common performance metric for regression models, where one series is the predicted values and the other series is the observed values.

mean_fractional_absolute_error([x], [y]) 
x.metric.mean_fractional_absolute_error(y) 
The mean squared error between two series of values is the mean of the squared element-wise differences [1]. It is a common performance metric for regression models.

mean_squared_error([x], [y]) 
x.metric.mean_squared_error(y) 
The Minkowski distance is a generalisation of the Euclidean (p=2) and Manhattan (p=1) distances to other powers p [1].

minkowski_distance([x], [y], p) 
x.metric.minkowski_distance(y, p) 
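A one-line Python sketch of the generalisation, showing how the Manhattan and Euclidean distances fall out as special cases:

```python
def minkowski_distance(xs, ys, p):
    """Minkowski distance: pth root of the sum of |x - y|^p."""
    return sum(abs(x - y) ** p for x, y in zip(xs, ys)) ** (1 / p)

print(minkowski_distance([0, 0], [3, 4], 2))  # Euclidean: 5.0
print(minkowski_distance([0, 0], [3, 4], 1))  # Manhattan: 7.0
```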
Precision is a classification performance metric which measures the fraction of true positive events in the set of events that a classifier has predicted to be positive. It is calculated as

precision = tp / (tp + fp)

where tp is the number of true positives and fp is the number of false positives. Precision is a measure of the purity of the classifier's positive predictions [1]. It is also known as the positive predictive value and purity.

precision_score([x], [y]) 
x.metric.precision_score(y) 
Recall is a classification performance metric which measures the fraction of positive events that are successfully predicted by a classifier. It is calculated as

recall = tp / (tp + fn)

where tp is the number of true positives and fn is the number of false negatives. Recall is a measure of the efficiency of the classifier at retrieving positive events [1]. It is also known as sensitivity, hit rate, true positive rate (TPR) and efficiency.

recall_score([x], [y]) 
x.metric.recall_score(y) 
The gain-loss ratio is the mean positive return of the series divided by the mean negative return of the series.

gain_loss_ratio([x]) 
x.finance.gain_loss_ratio() 
The information ratio is the mean excess return between a return series and a benchmark series, divided by the standard deviation of the excess return.

mean_stdev_ratio([x] - [y]) 
x.finance.information_ratio(y) 
Drawdown is calculated as

dd_i = (x_h - x_i) / x_h

where x_h is the high watermark (max) up to and including the point x_i [1]. Max drawdown is then the maximum value of the drawdowns dd_i over the sequence of values. This aggregation assumes that the column is in time order. This calculation is applied before ORDER BY in SQL syntax, so you should consider turning the table containing the data into a table variable, ordering that by the time-equivalent column, and then applying the method to the corresponding column in a select statement on the table variable.

max_drawdown([x]) 
x.finance.max_drawdown() 
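The running high watermark makes this a single pass over the ordered series. A Python sketch of the aggregation, assuming the input is already in time order:

```python
def max_drawdown(prices):
    """Largest fractional fall from the running high watermark."""
    high = prices[0]
    worst = 0.0
    for x in prices:
        high = max(high, x)              # high watermark up to and including x
        worst = max(worst, (high - x) / high)
    return worst

print(max_drawdown([100, 120, 90, 110]))  # falls 120 -> 90, i.e. 0.25
```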
Drawdown length is calculated as the number of rows between the high watermark value and the current row [1]. The max drawdown length is then the maximum value of the drawdown length in the time period. This aggregation assumes that the column is in time order. This calculation is applied before ORDER BY in SQL syntax, so you should consider turning the table containing the data into a table variable, ordering that by the time-equivalent column, and then applying the method to the corresponding column in a select statement on the table variable.

max_drawdown_length([x]) 
x.finance.max_drawdown_length() 
Drawdown is calculated as

dd_i = (x_h - x_i) / x_h

where x_h is the high watermark (max) up to and including the point x_i [1]. Mean drawdown is then the mean value of the drawdowns over the sequence of values. This aggregation assumes that the column is in time order. This calculation is applied before ORDER BY in SQL syntax, so you should consider turning the table containing the data into a table variable, ordering that by the time-equivalent column, and then applying the method to the corresponding column in a .select() on the table variable.

mean_drawdown([x]) 
x.finance.mean_drawdown() 
Drawdown length is calculated as the number of rows between the high watermark value and the current row [1]. The mean drawdown length is then the mean value of the drawdown length in the time period. This aggregation assumes that the column is in time order. This calculation is applied before ORDER BY in SQL syntax, so you should consider turning the table containing the data into a table variable, ordering that by the time-equivalent column, and then applying the method to the corresponding column in a .select() on the table variable.

mean_drawdown_length([x]) 
x.finance.mean_drawdown_length() 
Computes a returns series from a price series. Supports scaling by a simple scale factor or compounding.

prices_to_returns([Price], 1, 1.0, 0) OVER( ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW ) 
x.finance.prices_to_returns(1, 1, False) 
Computes a price series from a returns series and an initial value. Supports scaling by a simple scale factor or compounding.

returns_to_prices([rets], 400, 1.0, 0) OVER( ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW ) 
x.finance.returns_to_prices(400, 1, False) 
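The two transformations are inverses of one another. The Python sketch below assumes one plausible reading of the arguments (simple returns r = p_t / p_{t-1} - 1, log returns when compounding, and a multiplicative scale factor); the actual meanings of the Luminesce parameters are not documented here:

```python
import math

def prices_to_returns(prices, scale=1.0, compound=False):
    """Returns series from a price series (assumed semantics, see note above)."""
    rets = []
    for prev, cur in zip(prices, prices[1:]):
        r = math.log(cur / prev) if compound else cur / prev - 1.0
        rets.append(r * scale)
    return rets

def returns_to_prices(rets, initial, scale=1.0, compound=False):
    """Price series from a returns series and an initial value (inverse of the above)."""
    prices = [initial]
    for r in rets:
        r = r / scale
        prices.append(prices[-1] * (math.exp(r) if compound else 1.0 + r))
    return prices
```

Round-tripping a price series through both functions recovers the original values (up to floating-point error), which is a quick sanity check on either implementation.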
Semi-deviation is the standard deviation of the values in a returns series that fall below the mean return value.

semi_deviation([x]) 
x.finance.semi_deviation() 
The Sharpe ratio is calculated as the mean excess return over the risk-free rate, divided by the standard deviation of the excess return.

mean_stdev_ratio([x] - [r]) 
x.finance.sharpe_ratio(r) 
The tracking error is the standard deviation of the difference between an index's return series and a benchmark's.

window_stdev([x] - [y]) 
x.finance.tracking_error(y) 