[Kde-finance-apps] Brainstorming requirements for an Alkimia Quotes Backend: Feedback/Reality-Check Needed Please :)
Brian Cappello
briancappello at gmail.com
Fri May 21 23:58:16 CEST 2010
Yahoo splits up historical data compared to company infos, which I think
makes sense, so this is also based upon such a split. Sorry; it's a bit
long... But that's partly because copying from OO.org apparently likes to
insert lots of extra lines...
Intraday Quotes/Company Info/Fundamentals: Query the backend with a ticker
symbol and desired parameter(s?); get results back.
Input:
Symbol
Output Parameters: [What does everybody want? What am I forgetting/don't
know about? What kind of a balance between “base” statistics and “derived”
statistics are we trying to achieve?]
(high priority: required)
struct (OHLCV) [Date/Time, Open, High, Low, Close, Volume[, Open Interest]]
(Should this strictly be a part of the historical backend, or also provided
here for convenience?)
struct (POHLCV) [[Previous] Date/Time, Open, High, Low, Close, Volume[, Open
Interest]] (Should this strictly be a part of the historical backend, or
also provided here for convenience?)
qstring Company Name (NAME)
qstring Company Industry (INDUSTRY)
qstring Market Capitalization (MKTCAP) [string because Yahoo provides this
as X.XXK/M/B]
(high priority: Priorities are based upon what I've been told/read is
important, but that's not saying much. Anybody who knows what
buy-and-hold-types find important and/or knows their fundamental analysis
should probably re-order/re-write this list, because I *certainly* don't
know my FA...)
[Much of everything below has the potential to be N/A, so this needs to be
somehow gracefully handled.]
--Annualized--
double Dividend Rate (DIVR)
--Year over Year--
double Quarterly Revenue Growth (QRG) [This is a +/-percentage.]
double Quarterly Earnings Growth (QEG) [This is a +/-percentage.]
--Most Recent Quarter--
qstring Total Cash (TC) [string because Yahoo provides this as X.XXK/M/B]
qstring Total Debt (TD) [string because Yahoo provides this as X.XXK/M/B]
double Profit Margin (PM) [This is a +/-percentage.]
double Operating Margin (OM) [This is a +/-percentage.]
double Return on Assets (ROA) [This is a +/-percentage.]
double Return on Equity (ROE) [This is a +/-percentage.]
(medium priority)
qstring Shares Outstanding (SOUTSTANDING) [string because Yahoo provides
this as X.XXK/M/B]
qstring Float (SFLOAT) [string because Yahoo provides this as X.XXK/M/B]
qstring Shares Short (SSHORT)
double Percent Held by Insiders (PINSIDERS)
double Percent Held by Institutions (PINSTITUTIONS)
double Shares Short Percent of Float (PSHORT)
<Other sentiment indicators, especially some based on volume.> [But Yahoo
doesn't provide much beyond these listed, so I need to do some research into
a different source.]
(low priority)
double Trailing or Forward Price/Earnings (TP/E, FP/E)
double Earnings per Share (EPS)
double Diluted Earnings per Share (DEPS)
double Price/Sales (P/S)
double Price/Book (P/B)
qstring Revenue (REV) [string because Yahoo provides this as X.XXK/M/B]
double Revenue Per Share (RPS)
double Total Cash per Share (TCPS)
double Book Value per Share (BVPS)
(really low priority)
<Various derived price/history performance stuffs.>
<Other random stuffs from Yahoo's Key Statistics page.>
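For the X.XXK/M/B strings and the N/A values mentioned above, a minimal sketch of one way to normalize them into plain numbers. The function name `parse_abbreviated` and the set of strings treated as "not available" are assumptions for illustration only, not anything Yahoo or Alkimia defines:

```python
from typing import Optional

# Multipliers for the K/M/B suffixes described above.
_SUFFIXES = {"K": 1e3, "M": 1e6, "B": 1e9}


def parse_abbreviated(text: str) -> Optional[float]:
    """Convert a string such as '1.25B' to a float; return None when unavailable."""
    cleaned = text.strip().upper().replace(",", "")
    # Assumed spellings of "not available"; a real feed may use others.
    if cleaned in ("", "N/A", "NA", "-"):
        return None
    multiplier = 1.0
    if cleaned[-1] in _SUFFIXES:
        multiplier = _SUFFIXES[cleaned[-1]]
        cleaned = cleaned[:-1]
    try:
        return float(cleaned) * multiplier
    except ValueError:
        # Anything unparseable is reported as missing rather than raising.
        return None
```

Returning None for anything missing keeps "N/A" distinct from a real value of zero, so callers can decide how to display it.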
(high priority: required)
Historical Data: Downloaded and/or imported locally from CSV. Support CSV
export. Primary source (Yahoo/OtherInternetz/local CSV directory) should be
configurable (to not exclude users who have paid for higher quality data
sources than Yahoo).
Public interface: Provide a symbol, time period, select absolute/adjusted
prices, provide an end date/time (today/now if none provided), the number of
bars back from the end to return (supporting both directly as an int or as a
specified length of time units, e.g. 5 years), and (optionally) a maximum
lookback period so indicators can use “hidden” data; get results back. There
is the possibility for fewer bars than requested to be returned, so the
returned object must be able to accommodate such information. [Why not just
specify a start date? Mostly just because client views (layout maths) should
not need to be concerned with what the source/resulting time period is; they
should just be able to say how much data they need (or will fit) in the
easiest way possible and the backend figures out the rest.]
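To make the public interface above a bit more concrete, here is a rough sketch of what the request and the returned object could look like. All of the names (`HistoryRequest`, `HistoryResult`, `select_bars`, `is_short`) are made up for illustration, the stored bars are assumed to be sorted oldest first, and the "length of time units" variant is left out for brevity:

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class Bar:
    opened: datetime
    open: float
    high: float
    low: float
    close: float
    volume: int


@dataclass
class HistoryRequest:
    symbol: str
    period: str                     # e.g. "daily" or "weekly"
    bars: int                       # visible bars wanted, counted back from `end`
    adjusted: bool = False          # absolute vs. adjusted prices (not used in this sketch)
    end: Optional[datetime] = None  # None means "now"
    max_lookback: int = 0           # extra "hidden" bars for indicators


@dataclass
class HistoryResult:
    visible: List[Bar] = field(default_factory=list)
    hidden: List[Bar] = field(default_factory=list)
    requested: int = 0

    @property
    def is_short(self) -> bool:
        """True when fewer bars than requested were available."""
        return len(self.visible) < self.requested


def select_bars(stored: List[Bar], request: HistoryRequest) -> HistoryResult:
    """Pick the requested bars out of `stored` (assumed sorted oldest first)."""
    end = request.end or datetime.now()
    eligible = [b for b in stored if b.opened <= end]
    visible = eligible[-request.bars:] if request.bars > 0 else []
    remaining = eligible[:len(eligible) - len(visible)]
    hidden = remaining[-request.max_lookback:] if request.max_lookback > 0 else []
    return HistoryResult(visible, hidden, request.bars)
```

The client only says how many bars it wants; whether it got all of them is something it can read off the result instead of computing dates itself.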
Internal Historical Stuffs: The following is (somewhat) structured around
the limitations of how/what data Yahoo provides, but should still be
flexible enough to accommodate data from paid-providers or other sources.
I imagine some form of a standard historical container list that can
accommodate storing any time period's data (w/ each index containing the
Open Date [or Time] and the OHLCV data [plus Open Interest]). Each container
should hold everything from the IPO (or however many years the user wants)
up through the most recent (partial) close available, and be generic enough
that any stored time period can serve as the input to any of the
price-manipulation algorithms listed below. Every watched symbol
(configurably) maintains multiple containers, each holding one of the
following “primary” time periods:
Minutely - Unless somebody knows of a free bulk source, as far as I'm aware
this data needs to be “ripped” by sampling intraday quotes from the open
through close (probably at least twice a minute to ensure no single minute
gets skipped?).
Daily – Can be downloaded in bulk, specifying a start and end date.
Officially, Yahoo data only goes back to the early 70s, but many symbols go
back further and the oldest I've come across is 1929. Just using a global
default of 1/1/1900 works fine. The end date, if unspecified, defaults to
the latest date available. [However, this doesn't include holidays or days
the market was closed for special circumstances. See “time period
conversions” below for why this matters.] Therefore, we also store the
following two primary time periods:
Weekly – Same download URL/conditions as daily, but specifying weekly.
Monthly – Same download URL/conditions as daily, but specifying monthly.
Time period conversions: The simplest way I've found to do this is to
ignore the original time period and just sum up the source data into the
desired period. >1 period conversions should be relative to the oldest data
available (or maybe the first Monday or full new month/year?), otherwise the
most recent bar will always be a “full” period. What I like most about this
approach is that the same algorithm can be used for converting each primary
time period into greater time periods, while simultaneously providing a
reasonable balance between memory usage and processing resources. The
biggest caveat to this method – which is really just a matter of
definition/preference – is that for example using a time period of 5 on
daily data will produce different results compared to “real” weekly data.
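The conversion described above can be sketched roughly as follows, grouping every n source bars starting from the oldest one so that only the newest bar can be partial. The `Bar` tuple and the `aggregate` name are assumptions for the sake of the example:

```python
from typing import List, NamedTuple


class Bar(NamedTuple):
    date: str
    open: float
    high: float
    low: float
    close: float
    volume: int


def aggregate(bars: List[Bar], n: int) -> List[Bar]:
    """Combine every n consecutive bars, grouping from the oldest bar forward.

    `bars` is assumed to be sorted oldest first.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    out = []
    for start in range(0, len(bars), n):
        chunk = bars[start:start + n]
        out.append(Bar(
            date=chunk[0].date,                  # open date of the first source bar
            open=chunk[0].open,
            high=max(b.high for b in chunk),
            low=min(b.low for b in chunk),
            close=chunk[-1].close,               # close of the last source bar
            volume=sum(b.volume for b in chunk),
        ))
    return out
```

Because the grouping ignores the calendar, `aggregate(daily, 5)` drifts away from "real" weekly bars whenever a week has a holiday in it, which is the caveat mentioned above.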
Adjusted Prices – Yahoo provides an adjusted close alongside the absolute
prices, but that's all. The adjusted OHL still need to be calculated using
dividends and/or splits.
Dividends – Same download URL/conditions as daily, but specifying dividends.
Splits – I'm not aware of an easy way to download splits in bulk, at least
from Yahoo. They can, however, (in theory) be calculated by “walking back”
and comparing the close to the adjusted close. If they're different, check
to see if it's because of a dividend. If so, adjust the OHL by the dividend
(as a % of the pre-dividend close to avoid negative prices) and keep going.
If not, calculate/store the split ratio, and adjust the OHL. Proceed this
way until the end of the data set is reached. However, due to the
combination of potential rounding-losses/loss-of-precision when using
percents and there occasionally being errors in the data Yahoo provides,
such an approach is potentially error-prone and finding a way to parse split
dates/ratios from some website would probably be a much better way to go
about calculating adjusted prices... I need to do some more research into this,
as well as actually try to get the aforementioned algorithm implemented to
see how reliable it actually is.
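As a starting point for that experiment, here is an untested-against-real-data sketch of the "walking back" idea: compare the ratio of adjusted close to close on neighbouring days, and whenever it changes, see whether a known dividend explains the change. The names (`Quote`, `find_splits`), the tolerance value, and the assumption that all prices are positive are mine; real Yahoo data with its rounding and occasional errors would probably need a looser tolerance and more sanity checks:

```python
from typing import Dict, List, NamedTuple, Tuple


class Quote(NamedTuple):
    date: str
    close: float
    adj_close: float


def find_splits(quotes: List[Quote], dividends: Dict[str, float],
                tolerance: float = 1e-3) -> List[Tuple[str, float]]:
    """Walk back from the newest quote and return (ex-date, ratio) per inferred split.

    `quotes` is assumed sorted oldest first with positive prices;
    `dividends` maps an ex-date to the cash amount per share.
    """
    splits = []
    for i in range(len(quotes) - 1, 0, -1):
        newer, older = quotes[i], quotes[i - 1]
        # Change in the adjustment factor across this day boundary.
        step = (older.adj_close / older.close) / (newer.adj_close / newer.close)
        if abs(step - 1.0) <= tolerance:
            continue  # nothing happened between these two bars
        dividend = dividends.get(newer.date)
        if dividend is not None:
            # Dividend as a fraction of the pre-dividend close.
            expected = 1.0 - dividend / older.close
            if abs(step - expected) <= tolerance:
                continue  # the dividend explains the difference
        # Otherwise treat the change as a split, e.g. 2.0 for a 2-for-1.
        splits.append((newer.date, 1.0 / step))
    return splits
```

A dividend and a split falling on the same date is not handled here; in this sketch the whole change would be reported as a (slightly wrong) split ratio.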
Support for merging/updating: Historical data from Yahoo doesn't include
the latest close until about 13-14 hours prior to the next open for that
time period, so there needs to be logic for merging the most recent close
from the appropriate shorter primary time period into each >=daily primary
time period. Furthermore, the minimum number of historical days/weeks/months
that can be downloaded at a time is 10, so there also needs to be logic for
merging existing local data with potentially overlapping downloaded data.
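The overlap merge could be as simple as keying bars by date, as in this sketch. The `Row` layout and the rule that downloaded rows replace local ones on the same date are assumptions (the idea being that a locally stored partial close gets replaced once the final bar is downloaded):

```python
from typing import Dict, List, Tuple

# (ISO date, open, high, low, close, volume); the layout is assumed for this sketch.
Row = Tuple[str, float, float, float, float, int]


def merge_history(local: List[Row], downloaded: List[Row]) -> List[Row]:
    """Merge two bar lists keyed by ISO date; downloaded rows win on overlap."""
    by_date: Dict[str, Row] = {row[0]: row for row in local}
    by_date.update({row[0]: row for row in downloaded})
    # ISO dates (YYYY-MM-DD) sort chronologically as plain strings.
    return [by_date[d] for d in sorted(by_date)]
```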
(low priority)
Public interface for historical dividends and splits, based upon a symbol,
start and end date. [The most recent dividend and split ratio is available
from the symbol/infos interface above.]
Well, I guess that's about what I think we'd eventually want. But it's also
probably *really* ambitious... at least to complete all of that within 3
months. But, I also have no intention of stopping after mid-August so, yea.