CMU 10-714 DL Systems (2024 Fall)

References:

How to study

Read the lecture PDFs and watch the course videos on YouTube

Complete the corresponding hw

Gradient Descent - Optimization

Newton’s Method

Momentum

Adam

SGD

ResNet

HW2 - Layer Norm vs. Batch Norm

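Layer Normalization normalizes over the last dimension of the input (all features of a single sample), while Batch Normalization normalizes across the samples in a batch (each feature separately). In other words, the two normalize along different axes.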
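To make the difference in axes concrete, here is a minimal pure-Python sketch. It assumes a 2-D input of shape (batch, features) stored as nested lists, uses the biased variance, and leaves out the learnable scale and shift. The names `normalize`, `layer_norm`, and `batch_norm` are illustrative only and are not the hw2 API.

```python
import math


def normalize(values, eps=1e-5):
    """Shift a list to zero mean and scale it to (roughly) unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)  # biased variance
    return [(v - mean) / math.sqrt(var + eps) for v in values]


def layer_norm(x, eps=1e-5):
    """Normalize each row: one sample over all of its features."""
    return [normalize(row, eps) for row in x]


def batch_norm(x, eps=1e-5):
    """Normalize each column: one feature over all samples in the batch."""
    columns = [normalize(list(col), eps) for col in zip(*x)]
    return [list(row) for row in zip(*columns)]  # transpose back to rows


# A batch of 3 samples with 2 features each (made-up numbers).
x = [[1.0, 10.0],
     [2.0, 20.0],
     [3.0, 60.0]]

ln = layer_norm(x)  # every row has mean ~0
bn = batch_norm(x)  # every column has mean ~0
```

After layer norm each row sums to roughly zero, while after batch norm each column does, which is the "different axes" point in code form.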

  • Batch Normalization (paper):
    • aim: reduce internal covariate shift
    • advantage:
      • training stability
      • higher learning rate
    • disadvantage:
      • batch size dependent - statistics become unreliable with small batch sizes
      • training and test behavior can differ (batch statistics during training, running averages at test time)
      • difficult to apply to RNNs
  • Layer Normalization (paper):
    • aim: work well with RNNs (straightforward to apply, consistent regardless of batching)
    • advantage:
      • not batch size dependent
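The batch-size dependence listed above can be checked with a small experiment. The sketch below is an illustration under assumptions: pure Python, training-mode statistics only (no running averages), no learnable scale and shift, and made-up numbers. The helper names are hypothetical. It normalizes the same sample inside two different batches, and then alone in a batch of size 1.

```python
import math


def normalize(values, eps=1e-5):
    """Shift a list to zero mean and scale it to (roughly) unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


def layer_norm(x, eps=1e-5):
    """Per-sample normalization over features (rows)."""
    return [normalize(row, eps) for row in x]


def batch_norm(x, eps=1e-5):
    """Per-feature normalization over the batch (columns)."""
    columns = [normalize(list(col), eps) for col in zip(*x)]
    return [list(row) for row in zip(*columns)]


sample = [1.0, 2.0, 3.0]
batch_a = [sample, [2.0, 4.0, 6.0], [3.0, 6.0, 9.0]]
batch_b = [sample, [10.0, 0.0, -5.0], [0.0, 1.0, 7.0]]

# Layer norm only looks at the sample itself, so its output is unchanged.
ln_same = layer_norm(batch_a)[0] == layer_norm(batch_b)[0]

# Batch norm uses statistics of the whole batch, so the output changes.
bn_same = batch_norm(batch_a)[0] == batch_norm(batch_b)[0]

# With a single sample, every feature equals its own batch mean.
bn_single = batch_norm([sample])

print(ln_same, bn_same, bn_single)  # True False [[0.0, 0.0, 0.0]]
```

The batch-of-one case collapses to all zeros, which is the extreme form of the small-batch problem.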

Can Layer Normalization be understood as an improvement on Batch Normalization?

HW3 - Matmul GPU acceleration

basic parallelism

shared memory

pybind issue

When compiling with make on a T4 GPU in Colab, the following pybind-related error appeared:

/usr/local/lib/python3.11/dist-packages/pybind11/include/pybind11/detail/common.h(724): error: excessive recursion at instantiation of class “pybind11::detail::make_index_sequence_impl<13UL, 13UL, 14UL, 15UL, 16UL, 17UL, 18UL, 19UL, 20UL, 21UL, 22UL, 23UL, 24UL, 25UL, 26UL, 27UL, 28UL, 29UL, 30UL, 31UL, 32UL, 33UL, 34UL, 35UL, 36UL, 37UL, 38UL, 39UL, 40UL, 41UL, 42UL, 43UL, 44UL, 45UL, 46UL, 47UL, 48UL, 49UL, 50UL, 51UL, 52UL, 53UL, 54UL, 55UL, 56UL, 57UL, 58UL, 59UL, 60UL, 61UL, 62UL, 63UL, 64UL, 65UL, 66UL, 67UL, 68UL, 69UL, 70UL, 71UL, 72UL, 73UL, 74UL, 75UL, 76UL, 77UL, 78UL, 79UL, 80UL, 81UL, 82UL, 83UL, 84UL, 85UL, 86UL, 87UL, 88UL, 89UL, 90UL, 91UL, 92UL, 93UL, 94UL, 95UL, 96UL, 97UL, 98UL, 99UL, 100UL, 101UL, 102UL, 103UL, 104UL, 105UL, 106UL, 107UL, 108UL, 109UL, 110UL, 111UL, 112UL, 113UL, 114UL, 115UL, 116UL, 117UL, 118UL, 119UL, 120UL, 121UL, 122UL, 123UL, 124UL, 125UL, 126UL, 127UL, 128UL, 129UL, 130UL, 131UL, 132UL, 133UL, 134UL, 135UL, 136UL, 137UL, 138UL, 139UL, 140UL, 141UL, 142UL, 143UL, 144UL, 145UL, 146UL, 147UL, 148UL, 149UL, 150UL, 151UL, 152UL, 153UL, 154UL, 155UL, 156UL, 157UL, 158UL, 159UL, 160UL, 161UL, 162UL, 163UL, 164UL, 165UL, 166UL, 167UL, 168UL, 169UL, 170UL, 171UL, 172UL, 173UL, 174UL, 175UL, 176UL, 177UL, 178UL, 179UL, 180UL, 181UL, 182UL, 183UL, 184UL, 185UL, 186UL, 187UL, 188UL, 189UL, 190UL, 191UL, 192UL, 193UL, 194UL, 195UL, 196UL, 197UL, 198UL, 199UL, 200UL, 201UL, 202UL, 203UL, 204UL, 205UL, 206UL, 207UL, 208UL, 209UL, 210UL, 211UL, 212UL>”
struct make_index_sequence_impl : make_index_sequence_impl<N - 1, N - 1, S…> {};
^
detected during:
instantiation of class “pybind11::detail::make_index_sequence_impl<N, S…> [with N=14UL, S=<14UL, 15UL, 16UL, 17UL, 18UL, 19UL, 20UL, 21UL, 22UL, 23UL, 24UL, 25UL, 26UL, 27UL, 28UL, 29UL, 30UL, 31UL, 32UL, 33UL, 34UL, 35UL, 36UL, 37UL, 38UL, 39UL, 40UL, 41UL, 42UL, 43UL, 44UL, 45UL, 46UL, 47UL, 48UL, 49UL, 50UL, 51UL, 52UL, 53UL, 54UL, 55UL, 56UL, 57UL, 58UL, 59UL, 60UL, 61UL, 62UL, 63UL, 64UL, 65UL, 66UL, 67UL, 68UL, 69UL, 70UL, 71UL, 72UL, 73UL, 74UL, 75UL, 76UL, 77UL, 78UL, 79UL, 80UL, 81UL, 82UL, 83UL, 84UL, 85UL, 86UL, 87UL, 88UL, 89UL, 90UL, 91UL, 92UL, 93UL, 94UL, 95UL, 96UL, 97UL, 98UL, 99UL, 100UL, 101UL, 102UL, 103UL, 104UL, 105UL, 106UL, 107UL, 108UL, 109UL, 110UL, 111UL, 112UL, 113UL, 114UL, 115UL, 116UL, 117UL, 118UL, 119UL, 120UL, 121UL, 122UL, 123UL, 124UL, 125UL, 126UL, 127UL, 128UL, 129UL, 130UL, 131UL, 132UL, 133UL, 134UL, 135UL, 136UL, 137UL, 138UL, 139UL, 140UL, 141UL, 142UL, 143UL, 144UL, 145UL, 146UL, 147UL, 148UL, 149UL, 150UL, 151UL, 152UL, 153UL, 154UL, 155UL, 156UL, 157UL, 158UL, 159UL, 160UL, 161UL, 162UL, 163UL, 164UL, 165UL, 166UL, 167UL, 168UL, 169UL, 170UL, 171UL, 172UL, 173UL, 174UL, 175UL, 176UL, 177UL, 178UL, 179UL, 180UL, 181UL, 182UL, 183UL, 184UL, 185UL, 186UL, 187UL, 188UL, 189UL, 190UL, 191UL, 192UL, 193UL, 194UL, 195UL, 196UL, 197UL, 198UL, 199UL, 200UL, 201UL, 202UL, 203UL, 204UL, 205UL, 206UL, 207UL, 208UL, 209UL, 210UL, 211UL, 212UL>]” at line 724
instantiation of class “pybind11::detail::make_index_sequence_impl<N, S…> [with N=15UL, S=<15UL, 16UL, 17UL, 18UL, 19UL, 20UL, 21UL, 22UL, 23UL, 24UL, 25UL, 26UL, 27UL, 28UL, 29UL, 30UL, 31UL, 32UL, 33UL, 34UL, 35UL, 36UL, 37UL, 38UL, 39UL, 40UL, 41UL, 42UL, 43UL, 44UL, 45UL, 46UL, 47UL, 48UL, 49UL, 50UL, 51UL, 52UL, 53UL, 54UL, 55UL, 56UL, 57UL, 58UL, 59UL, 60UL, 61UL, 62UL, 63UL, 64UL, 65UL, 66UL, 67UL, 68UL, 69UL, 70UL, 71UL, 72UL, 73UL, 74UL, 75UL, 76UL, 77UL, 78UL, 79UL, 80UL, 81UL, 82UL, 83UL, 84UL, 85UL, 86UL, 87UL, 88UL, 89UL, 90UL, 91UL, 92UL, 93UL, 94UL, 95UL, 96UL, 97UL, 98UL, 99UL, 100UL, 101UL, 102UL, 103UL, 104UL, 105UL, 106UL, 107UL, 108UL, 109UL, 110UL, 111UL, 112UL, 113UL, 114UL, 115UL, 116UL, 117UL, 118UL, 119UL, 120UL, 121UL, 122UL, 123UL, 124UL, 125UL, 126UL, 127UL, 128UL, 129UL, 130UL, 131UL, 132UL, 133UL, 134UL, 135UL, 136UL, 137UL, 138UL, 139UL, 140UL, 141UL, 142UL, 143UL, 144UL, 145UL, 146UL, 147UL, 148UL, 149UL, 150UL, 151UL, 152UL, 153UL, 154UL, 155UL, 156UL, 157UL, 158UL, 159UL, 160UL, 161UL, 162UL, 163UL, 164UL, 165UL, 166UL, 167UL, 168UL, 169UL, 170UL, 171UL, 172UL, 173UL, 174UL, 175UL, 176UL, 177UL, 178UL, 179UL, 180UL, 181UL, 182UL, 183UL, 184UL, 185UL, 186UL, 187UL, 188UL, 189UL, 190UL, 191UL, 192UL, 193UL, 194UL, 195UL, 196UL, 197UL, 198UL, 199UL, 200UL, 201UL, 202UL, 203UL, 204UL, 205UL, 206UL, 207UL, 208UL, 209UL, 210UL, 211UL, 212UL>]” at line 724
instantiation of class “pybind11::detail::make_index_sequence_impl<N, S…> [with N=16UL, S=<16UL, 17UL, 18UL, 19UL, 20UL, 21UL, 22UL, 23UL, 24UL, 25UL, 26UL, 27UL, 28UL, 29UL, 30UL, 31UL, 32UL, 33UL, 34UL, 35UL, 36UL, 37UL, 38UL, 39UL, 40UL, 41UL, 42UL, 43UL, 44UL, 45UL, 46UL, 47UL, 48UL, 49UL, 50UL, 51UL, 52UL, 53UL, 54UL, 55UL, 56UL, 57UL, 58UL, 59UL, 60UL, 61UL, 62UL, 63UL, 64UL, 65UL, 66UL, 67UL, 68UL, 69UL, 70UL, 71UL, 72UL, 73UL, 74UL, 75UL, 76UL, 77UL, 78UL, 79UL, 80UL, 81UL, 82UL, 83UL, 84UL, 85UL, 86UL, 87UL, 88UL, 89UL, 90UL, 91UL, 92UL, 93UL, 94UL, 95UL, 96UL, 97UL, 98UL, 99UL, 100UL, 101UL, 102UL, 103UL, 104UL, 105UL, 106UL, 107UL, 108UL, 109UL, 110UL, 111UL, 112UL, 113UL, 114UL, 115UL, 116UL, 117UL, 118UL, 119UL, 120UL, 121UL, 122UL, 123UL, 124UL, 125UL, 126UL, 127UL, 128UL, 129UL, 130UL, 131UL, 132UL, 133UL, 134UL, 135UL, 136UL, 137UL, 138UL, 139UL, 140UL, 141UL, 142UL, 143UL, 144UL, 145UL, 146UL, 147UL, 148UL, 149UL, 150UL, 151UL, 152UL, 153UL, 154UL, 155UL, 156UL, 157UL, 158UL, 159UL, 160UL, 161UL, 162UL, 163UL, 164UL, 165UL, 166UL, 167UL, 168UL, 169UL, 170UL, 171UL, 172UL, 173UL, 174UL, 175UL, 176UL, 177UL, 178UL, 179UL, 180UL, 181UL, 182UL, 183UL, 184UL, 185UL, 186UL, 187UL, 188UL, 189UL, 190UL, 191UL, 192UL, 193UL, 194UL, 195UL, 196UL, 197UL, 198UL, 199UL, 200UL, 201UL, 202UL, 203UL, 204UL, 205UL, 206UL, 207UL, 208UL, 209UL, 210UL, 211UL, 212UL>]” at line 724
instantiation of class “pybind11::detail::make_index_sequence_impl<N, S…> [with N=17UL, S=<17UL, 18UL, 19UL, 20UL, 21UL, 22UL, 23UL, 24UL, 25UL, 26UL, 27UL, 28UL, 29UL, 30UL, 31UL, 32UL, 33UL, 34UL, 35UL, 36UL, 37UL, 38UL, 39UL, 40UL, 41UL, 42UL, 43UL, 44UL, 45UL, 46UL, 47UL, 48UL, 49UL, 50UL, 51UL, 52UL, 53UL, 54UL, 55UL, 56UL, 57UL, 58UL, 59UL, 60UL, 61UL, 62UL, 63UL, 64UL, 65UL, 66UL, 67UL, 68UL, 69UL, 70UL, 71UL, 72UL, 73UL, 74UL, 75UL, 76UL, 77UL, 78UL, 79UL, 80UL, 81UL, 82UL, 83UL, 84UL, 85UL, 86UL, 87UL, 88UL, 89UL, 90UL, 91UL, 92UL, 93UL, 94UL, 95UL, 96UL, 97UL, 98UL, 99UL, 100UL, 101UL, 102UL, 103UL, 104UL, 105UL, 106UL, 107UL, 108UL, 109UL, 110UL, 111UL, 112UL, 113UL, 114UL, 115UL, 116UL, 117UL, 118UL, 119UL, 120UL, 121UL, 122UL, 123UL, 124UL, 125UL, 126UL, 127UL, 128UL, 129UL, 130UL, 131UL, 132UL, 133UL, 134UL, 135UL, 136UL, 137UL, 138UL, 139UL, 140UL, 141UL, 142UL, 143UL, 144UL, 145UL, 146UL, 147UL, 148UL, 149UL, 150UL, 151UL, 152UL, 153UL, 154UL, 155UL, 156UL, 157UL, 158UL, 159UL, 160UL, 161UL, 162UL, 163UL, 164UL, 165UL, 166UL, 167UL, 168UL, 169UL, 170UL, 171UL, 172UL, 173UL, 174UL, 175UL, 176UL, 177UL, 178UL, 179UL, 180UL, 181UL, 182UL, 183UL, 184UL, 185UL, 186UL, 187UL, 188UL, 189UL, 190UL, 191UL, 192UL, 193UL, 194UL, 195UL, 196UL, 197UL, 198UL, 199UL, 200UL, 201UL, 202UL, 203UL, 204UL, 205UL, 206UL, 207UL, 208UL, 209UL, 210UL, 211UL, 212UL>]” at line 724
instantiation of class “pybind11::detail::make_index_sequence_impl<N, S…> [with N=18UL, S=<18UL, 19UL, 20UL, 21UL, 22UL, 23UL, 24UL, 25UL, 26UL, 27UL, 28UL, 29UL, 30UL, 31UL, 32UL, 33UL, 34UL, 35UL, 36UL, 37UL, 38UL, 39UL, 40UL, 41UL, 42UL, 43UL, 44UL, 45UL, 46UL, 47UL, 48UL, 49UL, 50UL, 51UL, 52UL, 53UL, 54UL, 55UL, 56UL, 57UL, 58UL, 59UL, 60UL, 61UL, 62UL, 63UL, 64UL, 65UL, 66UL, 67UL, 68UL, 69UL, 70UL, 71UL, 72UL, 73UL, 74UL, 75UL, 76UL, 77UL, 78UL, 79UL, 80UL, 81UL, 82UL, 83UL, 84UL, 85UL, 86UL, 87UL, 88UL, 89UL, 90UL, 91UL, 92UL, 93UL, 94UL, 95UL, 96UL, 97UL, 98UL, 99UL, 100UL, 101UL, 102UL, 103UL, 104UL, 105UL, 106UL, 107UL, 108UL, 109UL, 110UL, 111UL, 112UL, 113UL, 114UL, 115UL, 116UL, 117UL, 118UL, 119UL, 120UL, 121UL, 122UL, 123UL, 124UL, 125UL, 126UL, 127UL, 128UL, 129UL, 130UL, 131UL, 132UL, 133UL, 134UL, 135UL, 136UL, 137UL, 138UL, 139UL, 140UL, 141UL, 142UL, 143UL, 144UL, 145UL, 146UL, 147UL, 148UL, 149UL, 150UL, 151UL, 152UL, 153UL, 154UL, 155UL, 156UL, 157UL, 158UL, 159UL, 160UL, 161UL, 162UL, 163UL, 164UL, 165UL, 166UL, 167UL, 168UL, 169UL, 170UL, 171UL, 172UL, 173UL, 174UL, 175UL, 176UL, 177UL, 178UL, 179UL, 180UL, 181UL, 182UL, 183UL, 184UL, 185UL, 186UL, 187UL, 188UL, 189UL, 190UL, 191UL, 192UL, 193UL, 194UL, 195UL, 196UL, 197UL, 198UL, 199UL, 200UL, 201UL, 202UL, 203UL, 204UL, 205UL, 206UL, 207UL, 208UL, 209UL, 210UL, 211UL, 212UL>]” at line 724
[ 195 instantiation contexts not shown ]
instantiation of type “pybind11::detail::make_index_sequence<213UL>” at line 56 of /usr/local/lib/python3.11/dist-packages/pybind11/include/pybind11/detail/descr.h
instantiation of “pybind11::detail::descr<, Ts1…, Ts2…> pybind11::detail::operator+(const pybind11::detail::descr<N1, Ts1…> &, const pybind11::detail::descr<N2, Ts2…> &) [with N1=1UL, N2=213UL, Ts1=<>, Ts2=needle::cuda::CudaArray]” at line 485 of /usr/local/lib/python3.11/dist-packages/pybind11/include/pybind11/pybind11.h
instantiation of “void pybind11::cpp_function::initialize(Func &&, Return ()(Args…), const Extra &…) [with Func=void (&)(size_t, needle::cuda::scalar_t, needle::cuda::CudaArray *, std::vector<int32_t, std::allocator<int32_t>>, std::vector<int32_t, std::allocator<int32_t>>, size_t), Return=void, Args=<size_t, needle::cuda::scalar_t, needle::cuda::CudaArray , std::vector<int32_t, std::allocator<int32_t>>, std::vector<int32_t, std::allocator<int32_t>>, size_t>, Extra=<pybind11::name, pybind11::scope, pybind11::sibling>]" at line 265 of /usr/local/lib/python3.11/dist-packages/pybind11/include/pybind11/pybind11.h
instantiation of "pybind11::cpp_function::cpp_function(Return (*)(Args…), const Extra &…) [with Return=void, Args=<size_t, needle::cuda::scalar_t, needle::cuda::CudaArray *, std::vector<int32_t, std::allocator<int32_t>>, std::vector<int32_t, std::allocator<int32_t>>, size_t>, Extra=<pybind11::name, pybind11::scope, pybind11::sibling>]” at line 1384 of /usr/local/lib/python3.11/dist-packages/pybind11/include/pybind11/pybind11.h
instantiation of “pybind11::module_ &pybind11::module_::def(const char *, Func &&, const Extra &…) [with Func=void (&)(size_t, needle::cuda::scalar_t, needle::cuda::CudaArray *, std::vector<int32_t, std::allocator<int32_t>>, std::vector<int32_t, std::allocator<int32_t>>, size_t), Extra=<>]” at line 628 of /content/drive/MyDrive/10714/hw3/src/ndarray_backend_cuda.cu

/usr/local/lib/python3.11/dist-packages/pybind11/include/pybind11/detail/common.h(724): error: incomplete type “pybind11::detail::make_index_sequence_impl<13UL, 13UL, 14UL, 15UL, 16UL, 17UL, 18UL, 19UL, 20UL, 21UL, 22UL, 23UL, 24UL, 25UL, 26UL, 27UL, 28UL, 29UL, 30UL, 31UL, 32UL, 33UL, 34UL, 35UL, 36UL, 37UL, 38UL, 39UL, 40UL, 41UL, 42UL, 43UL, 44UL, 45UL, 46UL, 47UL, 48UL, 49UL, 50UL, 51UL, 52UL, 53UL, 54UL, 55UL, 56UL, 57UL, 58UL, 59UL, 60UL, 61UL, 62UL, 63UL, 64UL, 65UL, 66UL, 67UL, 68UL, 69UL, 70UL, 71UL, 72UL, 73UL, 74UL, 75UL, 76UL, 77UL, 78UL, 79UL, 80UL, 81UL, 82UL, 83UL, 84UL, 85UL, 86UL, 87UL, 88UL, 89UL, 90UL, 91UL, 92UL, 93UL, 94UL, 95UL, 96UL, 97UL, 98UL, 99UL, 100UL, 101UL, 102UL, 103UL, 104UL, 105UL, 106UL, 107UL, 108UL, 109UL, 110UL, 111UL, 112UL, 113UL, 114UL, 115UL, 116UL, 117UL, 118UL, 119UL, 120UL, 121UL, 122UL, 123UL, 124UL, 125UL, 126UL, 127UL, 128UL, 129UL, 130UL, 131UL, 132UL, 133UL, 134UL, 135UL, 136UL, 137UL, 138UL, 139UL, 140UL, 141UL, 142UL, 143UL, 144UL, 145UL, 146UL, 147UL, 148UL, 149UL, 150UL, 151UL, 152UL, 153UL, 154UL, 155UL, 156UL, 157UL, 158UL, 159UL, 160UL, 161UL, 162UL, 163UL, 164UL, 165UL, 166UL, 167UL, 168UL, 169UL, 170UL, 171UL, 172UL, 173UL, 174UL, 175UL, 176UL, 177UL, 178UL, 179UL, 180UL, 181UL, 182UL, 183UL, 184UL, 185UL, 186UL, 187UL, 188UL, 189UL, 190UL, 191UL, 192UL, 193UL, 194UL, 195UL, 196UL, 197UL, 198UL, 199UL, 200UL, 201UL, 202UL, 203UL, 204UL, 205UL, 206UL, 207UL, 208UL, 209UL, 210UL, 211UL, 212UL>” is not allowed
struct make_index_sequence_impl : make_index_sequence_impl<N - 1, N - 1, S…> {};
^
detected during:
instantiation of class “pybind11::detail::make_index_sequence_impl<N, S…> [with N=14UL, S=<14UL, 15UL, 16UL, 17UL, 18UL, 19UL, 20UL, 21UL, 22UL, 23UL, 24UL, 25UL, 26UL, 27UL, 28UL, 29UL, 30UL, 31UL, 32UL, 33UL, 34UL, 35UL, 36UL, 37UL, 38UL, 39UL, 40UL, 41UL, 42UL, 43UL, 44UL, 45UL, 46UL, 47UL, 48UL, 49UL, 50UL, 51UL, 52UL, 53UL, 54UL, 55UL, 56UL, 57UL, 58UL, 59UL, 60UL, 61UL, 62UL, 63UL, 64UL, 65UL, 66UL, 67UL, 68UL, 69UL, 70UL, 71UL, 72UL, 73UL, 74UL, 75UL, 76UL, 77UL, 78UL, 79UL, 80UL, 81UL, 82UL, 83UL, 84UL, 85UL, 86UL, 87UL, 88UL, 89UL, 90UL, 91UL, 92UL, 93UL, 94UL, 95UL, 96UL, 97UL, 98UL, 99UL, 100UL, 101UL, 102UL, 103UL, 104UL, 105UL, 106UL, 107UL, 108UL, 109UL, 110UL, 111UL, 112UL, 113UL, 114UL, 115UL, 116UL, 117UL, 118UL, 119UL, 120UL, 121UL, 122UL, 123UL, 124UL, 125UL, 126UL, 127UL, 128UL, 129UL, 130UL, 131UL, 132UL, 133UL, 134UL, 135UL, 136UL, 137UL, 138UL, 139UL, 140UL, 141UL, 142UL, 143UL, 144UL, 145UL, 146UL, 147UL, 148UL, 149UL, 150UL, 151UL, 152UL, 153UL, 154UL, 155UL, 156UL, 157UL, 158UL, 159UL, 160UL, 161UL, 162UL, 163UL, 164UL, 165UL, 166UL, 167UL, 168UL, 169UL, 170UL, 171UL, 172UL, 173UL, 174UL, 175UL, 176UL, 177UL, 178UL, 179UL, 180UL, 181UL, 182UL, 183UL, 184UL, 185UL, 186UL, 187UL, 188UL, 189UL, 190UL, 191UL, 192UL, 193UL, 194UL, 195UL, 196UL, 197UL, 198UL, 199UL, 200UL, 201UL, 202UL, 203UL, 204UL, 205UL, 206UL, 207UL, 208UL, 209UL, 210UL, 211UL, 212UL>]” at line 724
instantiation of class “pybind11::detail::make_index_sequence_impl<N, S…> [with N=15UL, S=<15UL, 16UL, 17UL, 18UL, 19UL, 20UL, 21UL, 22UL, 23UL, 24UL, 25UL, 26UL, 27UL, 28UL, 29UL, 30UL, 31UL, 32UL, 33UL, 34UL, 35UL, 36UL, 37UL, 38UL, 39UL, 40UL, 41UL, 42UL, 43UL, 44UL, 45UL, 46UL, 47UL, 48UL, 49UL, 50UL, 51UL, 52UL, 53UL, 54UL, 55UL, 56UL, 57UL, 58UL, 59UL, 60UL, 61UL, 62UL, 63UL, 64UL, 65UL, 66UL, 67UL, 68UL, 69UL, 70UL, 71UL, 72UL, 73UL, 74UL, 75UL, 76UL, 77UL, 78UL, 79UL, 80UL, 81UL, 82UL, 83UL, 84UL, 85UL, 86UL, 87UL, 88UL, 89UL, 90UL, 91UL, 92UL, 93UL, 94UL, 95UL, 96UL, 97UL, 98UL, 99UL, 100UL, 101UL, 102UL, 103UL, 104UL, 105UL, 106UL, 107UL, 108UL, 109UL, 110UL, 111UL, 112UL, 113UL, 114UL, 115UL, 116UL, 117UL, 118UL, 119UL, 120UL, 121UL, 122UL, 123UL, 124UL, 125UL, 126UL, 127UL, 128UL, 129UL, 130UL, 131UL, 132UL, 133UL, 134UL, 135UL, 136UL, 137UL, 138UL, 139UL, 140UL, 141UL, 142UL, 143UL, 144UL, 145UL, 146UL, 147UL, 148UL, 149UL, 150UL, 151UL, 152UL, 153UL, 154UL, 155UL, 156UL, 157UL, 158UL, 159UL, 160UL, 161UL, 162UL, 163UL, 164UL, 165UL, 166UL, 167UL, 168UL, 169UL, 170UL, 171UL, 172UL, 173UL, 174UL, 175UL, 176UL, 177UL, 178UL, 179UL, 180UL, 181UL, 182UL, 183UL, 184UL, 185UL, 186UL, 187UL, 188UL, 189UL, 190UL, 191UL, 192UL, 193UL, 194UL, 195UL, 196UL, 197UL, 198UL, 199UL, 200UL, 201UL, 202UL, 203UL, 204UL, 205UL, 206UL, 207UL, 208UL, 209UL, 210UL, 211UL, 212UL>]” at line 724
[ 195 instantiation contexts not shown ]
instantiation of type “pybind11::detail::make_index_sequence<213UL>” at line 56 of /usr/local/lib/python3.11/dist-packages/pybind11/include/pybind11/detail/descr.h
instantiation of “pybind11::detail::descr<, Ts1…, Ts2…> pybind11::detail::operator+(const pybind11::detail::descr<N1, Ts1…> &, const pybind11::detail::descr<N2, Ts2…> &) [with N1=1UL, N2=213UL, Ts1=<>, Ts2=needle::cuda::CudaArray]” at line 485 of /usr/local/lib/python3.11/dist-packages/pybind11/include/pybind11/pybind11.h
instantiation of “void pybind11::cpp_function::initialize(Func &&, Return ()(Args…), const Extra &…) [with Func=void (&)(size_t, needle::cuda::scalar_t, needle::cuda::CudaArray *, std::vector<int32_t, std::allocator<int32_t>>, std::vector<int32_t, std::allocator<int32_t>>, size_t), Return=void, Args=<size_t, needle::cuda::scalar_t, needle::cuda::CudaArray , std::vector<int32_t, std::allocator<int32_t>>, std::vector<int32_t, std::allocator<int32_t>>, size_t>, Extra=<pybind11::name, pybind11::scope, pybind11::sibling>]" at line 265 of /usr/local/lib/python3.11/dist-packages/pybind11/include/pybind11/pybind11.h
instantiation of "pybind11::cpp_function::cpp_function(Return (*)(Args…), const Extra &…) [with Return=void, Args=<size_t, needle::cuda::scalar_t, needle::cuda::CudaArray *, std::vector<int32_t, std::allocator<int32_t>>, std::vector<int32_t, std::allocator<int32_t>>, size_t>, Extra=<pybind11::name, pybind11::scope, pybind11::sibling>]” at line 1384 of /usr/local/lib/python3.11/dist-packages/pybind11/include/pybind11/pybind11.h
instantiation of “pybind11::module_ &pybind11::module_::def(const char *, Func &&, const Extra &…) [with Func=void (&)(size_t, needle::cuda::scalar_t, needle::cuda::CudaArray *, std::vector<int32_t, std::allocator<int32_t>>, std::vector<int32_t, std::allocator<int32_t>>, size_t), Extra=<>]” at line 628 of /content/drive/MyDrive/10714/hw3/src/ndarray_backend_cuda.cu

/usr/local/lib/python3.11/dist-packages/pybind11/include/pybind11/detail/common.h(730): error: class “pybind11::detail::make_index_sequence_impl<213UL>” has no member “type”
using make_index_sequence = typename make_index_sequence_impl::type;
^
detected during:
instantiation of type “pybind11::detail::make_index_sequence<213UL>” at line 56 of /usr/local/lib/python3.11/dist-packages/pybind11/include/pybind11/detail/descr.h
instantiation of “pybind11::detail::descr<, Ts1…, Ts2…> pybind11::detail::operator+(const pybind11::detail::descr<N1, Ts1…> &, const pybind11::detail::descr<N2, Ts2…> &) [with N1=1UL, N2=213UL, Ts1=<>, Ts2=needle::cuda::CudaArray]” at line 485 of /usr/local/lib/python3.11/dist-packages/pybind11/include/pybind11/pybind11.h
instantiation of “void pybind11::cpp_function::initialize(Func &&, Return ()(Args…), const Extra &…) [with Func=void (&)(size_t, needle::cuda::scalar_t, needle::cuda::CudaArray *, std::vector<int32_t, std::allocator<int32_t>>, std::vector<int32_t, std::allocator<int32_t>>, size_t), Return=void, Args=<size_t, needle::cuda::scalar_t, needle::cuda::CudaArray , std::vector<int32_t, std::allocator<int32_t>>, std::vector<int32_t, std::allocator<int32_t>>, size_t>, Extra=<pybind11::name, pybind11::scope, pybind11::sibling>]" at line 265 of /usr/local/lib/python3.11/dist-packages/pybind11/include/pybind11/pybind11.h
instantiation of "pybind11::cpp_function::cpp_function(Return (*)(Args…), const Extra &…) [with Return=void, Args=<size_t, needle::cuda::scalar_t, needle::cuda::CudaArray *, std::vector<int32_t, std::allocator<int32_t>>, std::vector<int32_t, std::allocator<int32_t>>, size_t>, Extra=<pybind11::name, pybind11::scope, pybind11::sibling>]” at line 1384 of /usr/local/lib/python3.11/dist-packages/pybind11/include/pybind11/pybind11.h
instantiation of “pybind11::module_ &pybind11::module_::def(const char *, Func &&, const Extra &…) [with Func=void (&)(size_t, needle::cuda::scalar_t, needle::cuda::CudaArray *, std::vector<int32_t, std::allocator<int32_t>>, std::vector<int32_t, std::allocator<int32_t>>, size_t), Extra=<>]” at line 628 of /content/drive/MyDrive/10714/hw3/src/ndarray_backend_cuda.cu

/usr/local/lib/python3.11/dist-packages/pybind11/include/pybind11/detail/descr.h(56): error: no instance of function template “pybind11::detail::plus_impl” matches the argument list
argument types are: (const pybind11::detail::descr<1UL>, const pybind11::detail::descr<213UL, needle::cuda::CudaArray>, pybind11::detail::index_sequence<0UL>, )
return plus_impl(a, b, make_index_sequence(), make_index_sequence());
^
/usr/local/lib/python3.11/dist-packages/pybind11/include/pybind11/detail/descr.h(45): note #3327-D: candidate function template “pybind11::detail::plus_impl” failed deduction
constexpr descr<N1 + N2, Ts1…, Ts2…> plus_impl(const descr<N1, Ts1…> &a,
^
detected during:
instantiation of “pybind11::detail::descr<, Ts1…, Ts2…> pybind11::detail::operator+(const pybind11::detail::descr<N1, Ts1…> &, const pybind11::detail::descr<N2, Ts2…> &) [with N1=1UL, N2=213UL, Ts1=<>, Ts2=needle::cuda::CudaArray]” at line 485 of /usr/local/lib/python3.11/dist-packages/pybind11/include/pybind11/pybind11.h
instantiation of “void pybind11::cpp_function::initialize(Func &&, Return ()(Args…), const Extra &…) [with Func=void (&)(size_t, needle::cuda::scalar_t, needle::cuda::CudaArray *, std::vector<int32_t, std::allocator<int32_t>>, std::vector<int32_t, std::allocator<int32_t>>, size_t), Return=void, Args=<size_t, needle::cuda::scalar_t, needle::cuda::CudaArray , std::vector<int32_t, std::allocator<int32_t>>, std::vector<int32_t, std::allocator<int32_t>>, size_t>, Extra=<pybind11::name, pybind11::scope, pybind11::sibling>]" at line 265 of /usr/local/lib/python3.11/dist-packages/pybind11/include/pybind11/pybind11.h
instantiation of "pybind11::cpp_function::cpp_function(Return (*)(Args…), const Extra &…) [with Return=void, Args=<size_t, needle::cuda::scalar_t, needle::cuda::CudaArray *, std::vector<int32_t, std::allocator<int32_t>>, std::vector<int32_t, std::allocator<int32_t>>, size_t>, Extra=<pybind11::name, pybind11::scope, pybind11::sibling>]” at line 1384 of /usr/local/lib/python3.11/dist-packages/pybind11/include/pybind11/pybind11.h
instantiation of “pybind11::module_ &pybind11::module_::def(const char *, Func &&, const Extra &…) [with Func=void (&)(size_t, needle::cuda::scalar_t, needle::cuda::CudaArray *, std::vector<int32_t, std::allocator<int32_t>>, std::vector<int32_t, std::allocator<int32_t>>, size_t), Extra=<>]” at line 628 of /content/drive/MyDrive/10714/hw3/src/ndarray_backend_cuda.cu

4 errors detected in the compilation of “/content/drive/MyDrive/10714/hw3/src/ndarray_backend_cuda.cu”.
CMake Error at ndarray_backend_cuda_generated_ndarray_backend_cuda.cu.o.cmake:280 (message):
Error generating file
/content/drive/MyDrive/10714/hw3/build/CMakeFiles/ndarray_backend_cuda.dir/src/./ndarray_backend_cuda_generated_ndarray_backend_cuda.cu.o

make[3]: *** [CMakeFiles/ndarray_backend_cuda.dir/build.make:605: CMakeFiles/ndarray_backend_cuda.dir/src/ndarray_backend_cuda_generated_ndarray_backend_cuda.cu.o] Error 1
make[3]: Leaving directory ‘/content/drive/MyDrive/10714/hw3/build’
make[2]: *** [CMakeFiles/Makefile2:121: CMakeFiles/ndarray_backend_cuda.dir/all] Error 2
make[2]: Leaving directory ‘/content/drive/MyDrive/10714/hw3/build’
make[1]: *** [Makefile:91: all] Error 2
make[1]: Leaving directory ‘/content/drive/MyDrive/10714/hw3/build’
make: *** [Makefile:9: lib] Error 2

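Not resolved yet, TODO. The log shows the recursion inside make_index_sequence<213> being cut off after 200 nested instantiations (N = 213 down to 13), which suggests a template instantiation depth limit in the compiler. Untested leads: raise that limit, or try a different pybind11 version.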