4

Given a set of time-series data, how can I determine whether it is increasing, decreasing, not steady, unchanged, etc.?

Year  Revenue
1993     0.85
1994     0.99
1995     1.01
1996     1.12
1997     1.25
1998     1.36
1999     1.28
2000     1.44
2
  • I'm not sure this is really Python- or pandas-related. How do you define increasing, decreasing, not steady, unchanged? How would you solve this without pandas? Maybe this is better suited for Cross Validated. Mar 21, 2017 at 7:21
  • 1
    pandas can certainly perform time series analysis, but you still need to define how you would identify a trend. For example, you could simply perform a linear regression on your values and use the slope as an indicator of trend strength. Typically, though, the less data you have, the more volatile such a trend is. You may also want to detect trend changes, in which case the time context becomes important. Time series analysis is not simple, but pandas and numpy can help you there. Mar 21, 2017 at 7:33

2 Answers

17

You can use numpy.polyfit; its deg argument (passed as order below) sets the degree of the fitting polynomial.

Refer: numpy.polyfit documentation

import numpy as np
import pandas as pd

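def trendline(data, order=1):
    # np.polyfit returns coefficients from highest degree to lowest,
    # so coeffs[-2] is the linear term (the slope when order=1).
    coeffs = np.polyfit(data.index.values, list(data), order)
    slope = coeffs[-2]
    return float(slope)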

# Sample data
revenue = [0.85, 0.99, 1.01, 1.12, 1.25, 1.36, 1.28, 1.44]
year = [1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000]

# If all values are identical, report a slope of 0 without fitting
if len(set(revenue)) <= 1:
    print(0)
else:
    df = pd.DataFrame({'year': year, 'revenue': revenue})

    slope = trendline(df['revenue'])
    print(slope)

If the slope is positive, the trend is increasing; if it is 0, the trend is constant; otherwise it is decreasing.

For your data the slope is 0.0804761904762, so the trend is increasing.

Update: for exactly constant values, trendline may not return exactly 0 (a comment below reports a negative result). Add a check such as len(set(revenue)) <= 1 and return 0 in that case.
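An exact-equality check only catches perfectly constant data. A minimal sketch of a tolerance-based variant follows; the function name classify_trend and the threshold tol are assumptions for illustration, not part of the answer above. It also centres x before fitting, which leaves the slope unchanged but keeps the fit well conditioned when x holds large values such as years.

```python
import numpy as np

def classify_trend(x, y, tol=1e-9):
    """Label a series by the sign of its least-squares slope.

    tol is a hypothetical threshold: slopes whose magnitude is at or
    below it are reported as "unchanged".
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Shifting x does not change the slope of the fitted line.
    x = x - x.mean()
    # Degree-1 fit: np.polyfit returns [slope, intercept].
    slope, _intercept = np.polyfit(x, y, 1)
    slope = float(slope)
    if abs(slope) <= tol:
        return "unchanged", slope
    return ("increasing" if slope > 0 else "decreasing"), slope

year = [1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000]
revenue = [0.85, 0.99, 1.01, 1.12, 1.25, 1.36, 1.28, 1.44]

print(classify_trend(year, revenue))      # increasing, slope near 0.0805
print(classify_trend(year, [200] * 8))    # unchanged
```

A sensible tol depends on the scale of your data; 1e-9 is only a placeholder.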

5
  • 2
    Hi, if I set revenue to a constant value, say revenue = [200, 200, 200, 200, 200, 200, 200, 200], I get a negative output. According to you it should be 0. Can you clarify?
    – Indraneel
    Aug 30, 2017 at 9:36
  • why does this answer have so many upvotes? it does not seem correct to me
    – Snow
    Aug 16, 2018 at 12:54
  • @Indraneel Regarding the negative slope: since we are doing a polynomial fit, it will not work for constant data. You can add a simple check: if len(set(listChar)) == 1, return 0.
    – Ashish
    May 21, 2020 at 20:55
  • @Snow It is not a rigorous way to predict or forecast, but you can certainly use it to check the current trend.
    – Ashish
    May 21, 2020 at 21:00
  • @Ashish it looks like you are returning the intercept rather than the slope with slope = coeffs[-2] ? np.polyfit returns [intercept, slope] while numpy.polynomial.polynomial.polyfit returns [slope, intercept] Jul 29, 2021 at 7:47
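
The coefficient order is easy to check directly. A minimal sketch, fitting a made-up line y = 2x + 1 with both functions (the variable names legacy and modern are just labels for this example):

```python
import numpy as np
from numpy.polynomial import polynomial as P

x = np.arange(5, dtype=float)
y = 2.0 * x + 1.0  # known slope 2, known intercept 1

legacy = np.polyfit(x, y, 1)  # coefficients ordered highest power first
modern = P.polyfit(x, y, 1)   # coefficients ordered lowest power first

print(legacy)  # approximately [2., 1.] -> [slope, intercept]
print(modern)  # approximately [1., 2.] -> [intercept, slope]
```

With np.polyfit and order=1, coeffs[-2] is therefore the slope; the same index applied to the output of numpy.polynomial.polynomial.polyfit would pick the intercept.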
8

If you sort the DataFrame by 'Year':

df.sort_values('Year', inplace=True)

you can then inspect these pd.Series attributes:
df.Revenue.is_monotonic  # alias of is_monotonic_increasing; removed in pandas 2.0
df.Revenue.is_monotonic_decreasing
df.Revenue.is_monotonic_increasing
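
Strict monotonicity is an all-or-nothing test, so a single dip makes it report False. A minimal sketch on the question's data; the share_up measure at the end is an assumption added for illustration, not something pandas provides as a trend test:

```python
import pandas as pd

df = pd.DataFrame({
    "Year": [1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000],
    "Revenue": [0.85, 0.99, 1.01, 1.12, 1.25, 1.36, 1.28, 1.44],
}).sort_values("Year")

# Both are False here: revenue mostly rises but dips in 1999.
print(df["Revenue"].is_monotonic_increasing)
print(df["Revenue"].is_monotonic_decreasing)

# Hypothetical softer measure: fraction of year-over-year steps that rise.
diffs = df["Revenue"].diff().dropna()
share_up = (diffs > 0).mean()
print(share_up)  # 6 of the 7 steps are increases
```

For noisy data, a slope-based check or a measure like share_up may be more informative than the monotonic flags alone.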
