🔋

Renewable Energy Investment Analysis

A Statistical Investigation into the Drivers of Project Success using Microsoft Excel

📅 Duration
May 2024 – July 2024 (3 months)
👤 Project Type
Individual Research Project
💻 Tools & Technologies
Excel, Statistical Analysis, Data Visualization
📊 Dataset Size
15,000 records | 13 variables

Project Overview

In the global push for renewable energy, it is often assumed that larger financial investments directly lead to greater job creation and better environmental outcomes. This project was designed to test that fundamental assumption through a rigorous statistical analysis of 15,000 renewable energy project records.

Using the advanced analytical capabilities of Microsoft Excel, including the Data Analysis ToolPak, I performed a series of hypothesis tests to examine the relationships between funding, project type, and key performance indicators. The results were surprising: the analysis found no significant statistical link between investment levels and their expected impacts, providing a crucial, data-driven insight for policymakers and investors.

Key Research Questions:

  • Investment Efficacy: Does higher funding automatically translate to better project outcomes?
  • Job Creation Hypothesis: Can we statistically prove that investment levels directly correlate with employment generation?
  • Policy Implications: What factors should policymakers prioritize beyond capital allocation?
  • Strategic Resource Allocation: How can governments and investors maximize economic and environmental impact?

Strategic Significance:

This analysis challenges fundamental assumptions in renewable energy policy, providing evidence-based insights that can prevent misguided investments and guide more effective resource allocation strategies. The findings demonstrate that correlation does not equal causation, and sometimes, there isn't even a correlation.

Project Artifacts

The complete analysis and findings are documented in two files. The Word document provides the full narrative and interpretation, while the Excel file contains all the raw data, calculations, PivotTables, and charts.

Research Questions

  1. In which renewable sector is the government investing more, and what is the proportion of government investment to job creation?
  2. What is the relationship between grid integration level and energy production/consumption?
  3. How does government investment relate to installed capacity and production?

Dataset Overview

Source: Kaggle Renewable Energy Dataset

Description: Contains 15,000 records with 13 variables on renewable energy systems, including installed capacity, energy production, consumption, storage, investment, and environmental impact.

Key Variables:

  • Type of Renewable Energy (Coded: 1-Solar, 2-Wind, etc.)
  • Installed Capacity (MW)
  • Energy Production (MWh/year)
  • Energy Consumption (MWh/year)
  • Energy Storage Capacity (MWh)
  • Storage Efficiency (%)
  • Grid Integration Level (Coded: 1-Fully Integrated, etc.)
  • Initial Investment (USD)
  • Funding Sources (Coded: 1-Government, 2-Private, etc.)
  • Financial Incentives (USD)
  • GHG Emission Reduction (tCO2e)
  • Air Pollution Reduction (Index)
  • Jobs Created

Technical Implementation

Tools & Technologies:

  • Microsoft Excel: Primary analysis platform with Data Analysis ToolPak for advanced statistical functions
  • Statistical Tests: T-tests, ANOVA, MANOVA, Chi-squared tests, Odds Ratio calculations
  • Data Visualization: Excel charts and graphs for exploratory data analysis and results presentation
  • Data Validation: IQR method, box plots, and Z-scores for outlier detection

Analysis Workflow:

  1. Data Import & Validation: Imported 15,000 records and validated data integrity
  2. Exploratory Data Analysis: Generated descriptive statistics and visualizations
  3. Hypothesis Formulation: Developed testable hypotheses based on business questions
  4. Statistical Testing: Applied appropriate tests for each hypothesis
  5. Results Interpretation: Analyzed p-values, effect sizes, and practical significance
  6. Documentation: Maintained detailed records of all analytical decisions and rationale

Quality Assurance:

All analytical steps were documented with clear rationale for methodological choices. Statistical assumptions were verified, and multiple validation approaches were used to ensure robust conclusions.

Methodology: A Rigorous Approach in Excel

This project demonstrates how Microsoft Excel can be used as a powerful tool for end-to-end statistical analysis, proving that sophisticated research can be conducted without expensive specialized software.

Data Validation & Cleaning:

  • The 15,000-record dataset was checked for missing values and duplicates (none found)
  • Outlier detection was performed using the IQR method (visualized with box plots) and Z-scores to ensure data integrity
  • Data types were confirmed and variables standardized for comparative analysis
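
For readers who want to reproduce the outlier checks outside Excel, the two rules named above can be sketched in Python. This is an illustration only, not the workbook's formulas: the sample values, the 1.5 × IQR fence, and the Z-score threshold are assumptions, and the "inclusive" quantile method is chosen because it matches Excel's QUARTILE.INC.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's fences)."""
    # method="inclusive" interpolates the same way as Excel's QUARTILE.INC.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

def zscore_outliers(values, threshold=3.0):
    """Return values whose sample Z-score exceeds the threshold in absolute value."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [v for v in values if abs(v - mean) / sd > threshold]

# Hypothetical installed-capacity readings (MW); 950 is a planted outlier.
capacity_mw = [120, 135, 128, 142, 131, 125, 138, 129, 133, 950]
print(iqr_outliers(capacity_mw))
# A lower threshold is used here because, with only 10 points,
# no Z-score can reach 3.
print(zscore_outliers(capacity_mw, threshold=2.5))
```

On a 15,000-row column the usual threshold of 3 is workable; the lower value is only needed for the tiny demonstration list.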

Exploratory Data Analysis (EDA):

  • Descriptive Statistics were generated to summarize all quantitative variables
  • PivotTables and PivotCharts were used extensively to explore relationships and distributions
  • Visual analysis included the count of projects by grid integration level and investment patterns

Hypothesis Testing:

  • Five distinct hypotheses were formulated to address the core research questions
  • T-tests and ANOVA were performed with the Data Analysis ToolPak in Excel; MANOVA, which is not among the ToolPak's built-in tools, was also applied
  • Chi-squared tests and Odds Ratios were calculated manually using formulas to assess associations between categorical variables
  • All tests were conducted at a 95% confidence level (α = 0.05)
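
The manual chi-squared calculation mentioned above follows a standard recipe: compute expected counts from the row and column totals, then sum (observed − expected)² / expected over all cells. A minimal Python sketch of that recipe is shown below. The contingency table is hypothetical, and the p-value helper only covers the 1-degree-of-freedom case, where it has a closed form.

```python
import math

def chi_squared_statistic(table):
    """Pearson chi-squared statistic and degrees of freedom for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

def p_value_df1(stat):
    """Upper-tail p-value of a chi-squared statistic with exactly 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))

# Hypothetical counts: rows = funding source (government, private),
# columns = energy type (solar, wind).
observed = [[30, 20], [20, 30]]
stat, df = chi_squared_statistic(observed)
print(stat, df, p_value_df1(stat))
```

For larger tables the statistic and degrees of freedom are computed the same way, but the p-value needs the general chi-squared distribution (in Excel, CHISQ.DIST.RT).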

Key Methodological Decision:

The dataset was treated as a sample rather than a complete population due to the near-uniform distribution of quantitative variables, which is uncharacteristic of full population datasets. This assumption was crucial for proper statistical inference.
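
One way to put a number on "near-uniform" is to look at skewness and excess kurtosis: a continuous uniform distribution has skewness 0 and excess kurtosis of −1.2, whereas a normal distribution has 0 for both. The sketch below is a hypothetical check, not the procedure used in the workbook. It uses population moment formulas, so its output will differ slightly from Excel's sample-adjusted SKEW and KURT functions.

```python
import statistics

def skewness_and_excess_kurtosis(values):
    """Population skewness and excess kurtosis (normal = 0, continuous uniform = -1.2)."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3

# Hypothetical, perfectly even spread of values standing in for one variable.
evenly_spread = list(range(1, 1001))
skew, kurt = skewness_and_excess_kurtosis(evenly_spread)
print(round(skew, 3), round(kurt, 3))
```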

Key Findings & Visualizations

The analysis consistently found that the observed differences in project outcomes were likely due to random chance rather than a true underlying relationship with investment. The visuals below summarize the supporting evidence:

Slide 1: Descriptive Statistics Overview

Figure: Descriptive Statistics
Strategic Insight: An initial summary of the data, highlighting the near-uniform distribution of variables that informed the decision to treat the dataset as a sample. This foundational analysis shaped all subsequent statistical interpretations.

Slide 2: Grid Integration Distribution Analysis

Figure: Grid Integration Levels
Strategic Finding: EDA showed an even distribution of projects across different integration levels. However, ANOVA testing later revealed this factor had no significant impact on energy production (p = 0.35), challenging assumptions about infrastructure importance.

Key Statistical Finding: Investment vs. Job Creation

Critical Discovery: T-test analysis revealed no statistically significant difference (p = 0.81) in the number of jobs created between high-investment and low-investment government projects. This directly challenges the fundamental assumption that more funding equals more jobs, providing crucial evidence for policy reform.
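
To make the comparison concrete, here is a minimal Python sketch of a two-sample t-test on jobs created. It assumes the unequal-variances (Welch) variant, which the write-up does not specify, and it approximates the p-value with a normal distribution, which is reasonable only for large groups such as those in a 15,000-record dataset. The job counts are hypothetical.

```python
import math
import statistics

def welch_t_test(a, b):
    """Welch's two-sample t statistic with a large-sample (normal) two-sided p-value."""
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    # Standard error of the difference in means, without pooling variances.
    se = math.sqrt(var_a / len(a) + var_b / len(b))
    t = (mean_a - mean_b) / se
    # Normal approximation to the t distribution; only suitable for large samples.
    p = 2 * (1 - statistics.NormalDist().cdf(abs(t)))
    return t, p

# Hypothetical jobs-created figures for high- and low-investment projects.
high_investment = [410, 395, 430, 402, 418, 399]
low_investment = [405, 420, 398, 412, 407, 415]
t_stat, p_value = welch_t_test(high_investment, low_investment)
print(round(t_stat, 3), round(p_value, 3))
```

A p-value above 0.05, as reported in this project, means the difference in group means is within what random variation alone would produce.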

Hypothesis Testing Results & Insights

Five core hypotheses were rigorously tested using appropriate statistical methods. The results consistently challenged conventional assumptions about renewable energy project success factors.

Hypothesis Testing Results:

Hypothesis 1: Higher investment levels lead to greater job creation

Test Used: T-test

Result: Not Statistically Significant (p = 0.81)

Business Insight: The data shows no evidence that simply increasing a project's budget guarantees more jobs. Job creation is likely influenced by other factors, such as the type of energy (e.g., solar vs. wind) or the project's operational scale.

Hypothesis 2: Government investment correlates with specific energy types

Test Used: Chi-squared Test

Result: Not Statistically Significant (p = 0.47)

Business Insight: Government funding appears to be distributed relatively evenly across renewable energy types, suggesting a diversified investment strategy rather than sector-specific bias.

Hypothesis 3: Grid integration level affects energy production efficiency

Test Used: ANOVA

Result: Not Statistically Significant (p = 0.35)

Business Insight: Projects at different grid integration levels show no significant difference in production outcomes, suggesting that factors other than integration status account for the variation in energy production in this dataset.
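
For reference, the F statistic behind a one-way ANOVA like this one compares variation between group means to variation within groups. The Python sketch below shows that calculation under stated assumptions: the production figures and the three groups are hypothetical, and the p-value lookup (F.DIST.RT in Excel) is left out.

```python
import statistics

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA over lists of values."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.fmean(all_values)
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (statistics.fmean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean.
    ss_within = sum((v - statistics.fmean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical energy production values grouped by grid integration level.
production_by_level = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_stat, df_b, df_w = one_way_anova_f(production_by_level)
print(f_stat, df_b, df_w)
```

An F value close to 1 indicates that group means differ about as much as random noise would predict, which is the pattern a non-significant result reflects.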

Hypothesis 4: Funding source influences project scale

Test Used: Odds Ratio Analysis

Result: Statistically Significant but Small Effect (OR = 0.96)

Business Insight: Government-funded projects show a slightly lower likelihood of being large-scale, suggesting private investment may be more concentrated in major infrastructure projects.
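
An odds ratio is usually read together with a confidence interval: if the interval excludes 1, the association is significant at the corresponding level. The sketch below shows the standard calculation for a 2x2 table with a Wald interval on the log scale. The cell counts are hypothetical and are not the counts behind the OR of 0.96 reported here.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for the 2x2 table [[a, b], [c, d]]."""
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR); every cell must be non-zero.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(odds_ratio) - z * se_log)
    high = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, low, high

# Hypothetical counts: rows = funding source (government, private),
# columns = project scale (large, not large).
or_value, ci_low, ci_high = odds_ratio_ci(40, 60, 50, 50)
print(round(or_value, 3), round(ci_low, 3), round(ci_high, 3))
```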

Hypothesis 5: Grid integration affects combined production and consumption metrics

Test Used: MANOVA

Result: Not Statistically Significant (p = 0.69)

Business Insight: Even when examining multiple outcome variables simultaneously, grid integration level does not demonstrate significant impact on operational performance.

Challenges & Learnings

A key challenge was interpreting the near-uniform distribution of quantitative variables, which led to the assumption that the dataset was a sample rather than a complete population. This project reinforced the importance of rigorous data validation and hypothesis testing to avoid drawing conclusions from patterns that may be due to random chance.

Conclusion & Strategic Recommendation

The most significant takeaway from this project is that correlation does not equal causation, and sometimes, there isn't even a correlation. Four of the five hypotheses showed no statistical significance, and the one significant result (Hypothesis 4) had only a small effect, indicating that investment size is not the primary lever for success in this dataset.

Recommendation for Stakeholders:

Policymakers should avoid strategies that rely solely on increasing investment amounts. Instead, resources should be directed towards identifying and funding projects based on other factors, such as technological maturity, geographical advantages, and specific policy supports. These factors were not captured in this dataset, so their influence is a hypothesis for follow-up work rather than a finding of this analysis.

Business Impact & Value:

  • Policy Reform: Evidence-based guidance for government funding strategies, potentially redirecting billions in investment toward more effective approaches
  • Risk Management: Helps investors avoid the false security of assuming larger budgets guarantee better returns
  • Strategic Planning: Provides foundation for developing more sophisticated evaluation frameworks beyond simple funding metrics
  • Future Research Direction: Identifies critical gaps requiring additional data collection (geography, technology maturity, policy environment)

Methodological Excellence:

This project showcases the power of fundamental statistical testing to provide valuable, non-obvious insights that can prevent misguided investments and lead to more effective policy-making. By demonstrating rigorous analysis capabilities using accessible tools like Excel, it proves that sophisticated research doesn't always require expensive specialized software.

Key Professional Achievement: Found no statistical support for a widely held assumption, delivering evidence-based insights that challenge conventional thinking and provide strategic value for renewable energy stakeholders.