<!DOCTYPE html>
<html lang="en">

<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1">
    <meta http-equiv="X-UA-Compatible" content="ie=edge">
    <title>miti99's CV</title>
    <link rel="stylesheet"
        href="https://cdnjs.cloudflare.com/ajax/libs/github-markdown-css/5.5.1/github-markdown-light.min.css"
        integrity="sha512-Pmhg2i/F7+5+7SsdoUqKeH7UAZoVMYb1sxGOoJ0jWXAEHP0XV2H4CITyK267eHWp2jpj7rtqWNkmEOw1tNyYpg=="
        crossorigin="anonymous" referrerpolicy="no-referrer" />
    <link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.16.12/dist/katex.min.css" integrity="sha384-PDbUeNCuE6bOPudPOgFyIUEy3UJawJVwr3XlGO90FIuf5qNIoTLSgOJo/dC2ZXV/" crossorigin="anonymous">

    <!-- The loading of KaTeX is deferred to speed up page rendering -->
    <script defer src="https://cdn.jsdelivr.net/npm/katex@0.16.12/dist/katex.min.js" integrity="sha384-VkqWq8xtm5YQk1BBXczQ8/Sx+DlCzF8cuS43bZwmtVXzRFtyLTqTCdP7MKmKo+KN" crossorigin="anonymous"></script>

    <!-- To automatically render math in text elements, include the auto-render extension: -->
    <script defer src="https://cdn.jsdelivr.net/npm/katex@0.16.12/dist/contrib/auto-render.min.js" integrity="sha384-hCXGrW6PitJEwbkoStFjeJxv+fSOOQKOPbJxSfM6G5sWZjAyWhXiTIIAmQqnlLlh" crossorigin="anonymous"
        onload="renderMathInElement(document.body, {delimiters: [{ left: '$$',  right: '$$',  display: false }]});"></script>
    <style>
        .markdown-body {
            box-sizing: border-box;
            min-width: 200px;
            max-width: 980px;
            margin: 0 auto;
            padding: 45px;
        }

        @media (max-width: 767px) {
            .markdown-body {
                padding: 15px;
            }
        }
    </style>
</head>

<body>
    <article class="markdown-body">
        <h1>miti99's CV</h1>
        <ul>
        <li>Email: <a href="mailto:john.doe@email.com">john.doe@email.com</a></li>
        <li>Location: San Francisco, CA</li>
        <li>Website: <a href="https://rendercv.com/">rendercv.com</a></li>
        <li>LinkedIn: <a href="https://linkedin.com/in/rendercv">rendercv</a></li>
        <li>GitHub: <a href="https://github.com/rendercv">rendercv</a></li>
        </ul>
        <h1>Welcome to RenderCV</h1>
        <p>RenderCV reads a CV written in a YAML file and generates a PDF with professional typography.</p>
        <p>See the <a href="https://docs.rendercv.com">documentation</a> for more details.</p>
        <h1>Education</h1>
        <h2><strong>Princeton University</strong>, Computer Science</h2>
        <p><strong>PhD</strong></p>
        <p>Princeton, NJ</p>
        <p>Sept 2018 – May 2023</p>
        <ul>
        <li>
        <p>Thesis: Efficient Neural Architecture Search for Resource-Constrained Deployment</p>
        </li>
        <li>
        <p>Advisor: Prof. Sanjeev Arora</p>
        </li>
        <li>
        <p>NSF Graduate Research Fellowship, Siebel Scholar (Class of 2022)</p>
        </li>
        </ul>
        <h2><strong>Boğaziçi University</strong>, Computer Engineering</h2>
        <p><strong>BS</strong></p>
        <p>Istanbul, Türkiye</p>
        <p>Sept 2014 – June 2018</p>
        <ul>
        <li>
        <p>GPA: 3.97/4.00, Valedictorian</p>
        </li>
        <li>
        <p>Fulbright Scholarship recipient for graduate studies</p>
        </li>
        </ul>
        <h1>Experience</h1>
        <h2><strong>Nexus AI</strong>, Co-Founder &amp; CTO</h2>
        <p>San Francisco, CA</p>
        <p>June 2023 – present</p>
        <p>2 years 9 months</p>
        <ul>
        <li>
        <p>Built foundation model infrastructure serving 2M+ monthly API requests with 99.97% uptime</p>
        </li>
        <li>
        <p>Raised $18M Series A led by Sequoia Capital, with participation from a16z and Founders Fund</p>
        </li>
        <li>
        <p>Scaled engineering team from 3 to 28 across ML research, platform, and applied AI divisions</p>
        </li>
        <li>
        <p>Developed proprietary inference optimization reducing latency by 73% compared to baseline</p>
        </li>
        </ul>
        <h2><strong>NVIDIA Research</strong>, Research Intern</h2>
        <p>Santa Clara, CA</p>
        <p>May 2022 – Aug 2022</p>
        <p>4 months</p>
        <ul>
        <li>
        <p>Designed sparse attention mechanism reducing transformer memory footprint by 4.2x</p>
        </li>
        <li>
        <p>Co-authored paper accepted at NeurIPS 2022 (spotlight presentation, top 5% of submissions)</p>
        </li>
        </ul>
        <h2><strong>Google DeepMind</strong>, Research Intern</h2>
        <p>London, UK</p>
        <p>May 2021 – Aug 2021</p>
        <p>4 months</p>
        <ul>
        <li>
        <p>Developed reinforcement learning algorithms for multi-agent coordination</p>
        </li>
        <li>
        <p>Published research at top-tier venues with significant academic impact:</p>
        <ul>
        <li>
        <p>ICML 2022 main conference paper, cited 340+ times within two years</p>
        </li>
        <li>
        <p>NeurIPS 2022 workshop paper on emergent communication protocols</p>
        </li>
        <li>
        <p>Invited journal extension in JMLR (2023)</p>
        </li>
        </ul>
        </li>
        </ul>
        <h2><strong>Apple ML Research</strong>, Research Intern</h2>
        <p>Cupertino, CA</p>
        <p>May 2020 – Aug 2020</p>
        <p>4 months</p>
        <ul>
        <li>
        <p>Created on-device neural network compression pipeline deployed across 50M+ devices</p>
        </li>
        <li>
        <p>Filed 2 patents on efficient model quantization techniques for edge inference</p>
        </li>
        </ul>
        <h2><strong>Microsoft Research</strong>, Research Intern</h2>
        <p>Redmond, WA</p>
        <p>May 2019 – Aug 2019</p>
        <p>4 months</p>
        <ul>
        <li>
        <p>Implemented novel self-supervised learning framework for low-resource language modeling</p>
        </li>
        <li>
        <p>Research integrated into Azure Cognitive Services, reducing training data requirements by 60%</p>
        </li>
        </ul>
        <h1>Projects</h1>
        <h2><strong><a href="https://github.com/">FlashInfer</a></strong></h2>
        <p>Jan 2023 – present</p>
        <p>Open-source library for high-performance LLM inference kernels</p>
        <ul>
        <li>
        <p>Achieved 2.8x speedup over baseline attention implementations on A100 GPUs</p>
        </li>
        <li>
        <p>Adopted by 3 major AI labs, 8,500+ GitHub stars, 200+ contributors</p>
        </li>
        </ul>
        <h2><strong><a href="https://github.com/">NeuralPrune</a></strong></h2>
        <p>Jan 2021</p>
        <p>Automated neural network pruning toolkit with differentiable masks</p>
        <ul>
        <li>
        <p>Reduced model size by 90% with less than 1% accuracy degradation on ImageNet</p>
        </li>
        <li>
        <p>Featured in PyTorch ecosystem tools, 4,200+ GitHub stars</p>
        </li>
        </ul>
        <h1>Publications</h1>
        <h2><strong>Sparse Mixture-of-Experts at Scale: Efficient Routing for Trillion-Parameter Models</strong></h2>
        <p>July 2023</p>
        <p><em>John Doe</em>, Sarah Williams, David Park</p>
        <p><a href="https://doi.org/10.1234/neurips.2023.1234">10.1234/neurips.2023.1234</a> (NeurIPS 2023)</p>
        <h2><strong>Neural Architecture Search via Differentiable Pruning</strong></h2>
        <p>Dec 2022</p>
        <p>James Liu, <em>John Doe</em></p>
        <p><a href="https://doi.org/10.1234/neurips.2022.5678">10.1234/neurips.2022.5678</a> (NeurIPS 2022, Spotlight)</p>
        <h2><strong>Multi-Agent Reinforcement Learning with Emergent Communication</strong></h2>
        <p>July 2022</p>
        <p>Maria Garcia, <em>John Doe</em>, Tom Anderson</p>
        <p><a href="https://doi.org/10.1234/icml.2022.9012">10.1234/icml.2022.9012</a> (ICML 2022)</p>
        <h2><strong>On-Device Model Compression via Learned Quantization</strong></h2>
        <p>May 2021</p>
        <p><em>John Doe</em>, Kevin Wu</p>
        <p><a href="https://doi.org/10.1234/iclr.2021.3456">10.1234/iclr.2021.3456</a> (ICLR 2021, Best Paper Award)</p>
        <h1>Selected Honors</h1>
        <ul>
        <li>
        <p>MIT Technology Review 35 Innovators Under 35 (2024)</p>
        </li>
        <li>
        <p>Forbes 30 Under 30 in Enterprise Technology (2024)</p>
        </li>
        <li>
        <p>ACM Doctoral Dissertation Award Honorable Mention (2023)</p>
        </li>
        <li>
        <p>Google PhD Fellowship in Machine Learning (2020 – 2023)</p>
        </li>
        <li>
        <p>Fulbright Scholarship for Graduate Studies (2018)</p>
        </li>
        </ul>
        <h1>Skills</h1>
        <p><strong>Languages:</strong> Python, C++, CUDA, Rust, Julia</p>
        <p><strong>ML Frameworks:</strong> PyTorch, JAX, TensorFlow, Triton, ONNX</p>
        <p><strong>Infrastructure:</strong> Kubernetes, Ray, distributed training, AWS, GCP</p>
        <p><strong>Research Areas:</strong> Neural architecture search, model compression, efficient inference, multi-agent RL</p>
        <h1>Patents</h1>
        <ol>
        <li>
        <p>Adaptive Quantization for Neural Network Inference on Edge Devices (US Patent 11,234,567)</p>
        </li>
        <li>
        <p>Dynamic Sparsity Patterns for Efficient Transformer Attention (US Patent 11,345,678)</p>
        </li>
        <li>
        <p>Hardware-Aware Neural Architecture Search Method (US Patent 11,456,789)</p>
        </li>
        </ol>
        <h1>Invited Talks</h1>
        <ol>
        <li>
        <p>Scaling Laws for Efficient Inference — Stanford HAI Symposium (2024)</p>
        </li>
        <li>
        <p>Building AI Infrastructure for the Next Decade — TechCrunch Disrupt (2024)</p>
        </li>
        <li>
        <p>From Research to Production: Lessons in ML Systems — NeurIPS Workshop (2023)</p>
        </li>
        <li>
        <p>Efficient Deep Learning: A Practitioner's Perspective — Google Tech Talk (2022)</p>
        </li>
        </ol>
        <h1>Any Section Title</h1>
        <p>You can use any section title you want.</p>
        <p>You can choose any entry type for the section: <code>TextEntry</code>, <code>ExperienceEntry</code>, <code>EducationEntry</code>, <code>PublicationEntry</code>, <code>BulletEntry</code>, <code>NumberedEntry</code>, or <code>ReversedNumberedEntry</code>.</p>
        <p>Markdown syntax is supported everywhere.</p>
        <p>The <code>design</code> field in YAML gives you control over almost any aspect of your CV design.</p>
        <p>See the <a href="https://docs.rendercv.com">documentation</a> for more details.</p>
    </article>
</body>

</html>