I am analyzing a data set that looks like this:
#Latitude Longitude Depth [m] Bathy depth [m] CaCO3 [%] ...
-78 -177 0 693 1
-78 -173 0 573 2
.
.
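The plan is to read the data into a vector and then split it into groups by bathymetric depth ("Bathy depth"). At the same time it needs to be split by ocean basin. For example, all data points in the North Atlantic that also lie at bathymetric depths of 500-1500 m, 1000-2000 m, 1500-2500 m, ... should each be in their own group (possibly a vector or some other object). The idea is to be able to output these groups to separate text files.
I have attempted this in a rather clunky way. You can see it below: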
#include <iostream>
#include <sstream>
#include <fstream>
#include <vector>
#include <string>
#include <GeographicLib/Geodesic.hpp> //Library which allows for distance calculations
using namespace std;
using namespace GeographicLib;
//Define each basin spatially
//North Atlantic
double NAtlat1 = 1, NAtlong1 = 1, NAtlat2 = 2, NAtlong2=2; //Incorrect values, to be set later
//South Atlantic
double SAtlat1 = 1, SAtlong1 = 1, SAtlat2 = 2, SAtlong2=2;
//North Pacific and the all others...
struct Point
{
//structure Sample code/label--Lat--Long--SedimentDepth[m]--BathymetricDepth[m]--CaCO3[%]...
string dummy;
double latitude, longitude, rockDepth, bathyDepth, CaCO3, fCaCO3, bSilica, Quartz, CO3i, CO3c, DCO3;
string dummy2;
//Use Overload>> operator
friend istream& operator>>(istream& inputFile, Point& p);
};
//Overload operator
istream& operator>>(istream& inputFile, Point& p)
{
string row_text;
//Read line from input, store in row_text
getline(inputFile, row_text);
//Extract line, store in ss row_stream
istringstream row_stream(row_text);
//Read-in data into each variable
row_stream >> p.dummy >> p.latitude >> p.longitude >> p.rockDepth >> p.bathyDepth >> p.CaCO3 >> p.fCaCO3 >> p.bSilica >> p.Quartz >> p.CO3i >> p.CO3c >> p.DCO3 >> p.dummy2;
return inputFile;
}
int main ()
{
//Geodesic class.
const Geodesic& geod = Geodesic::WGS84();
//Input file
ifstream inputFile("Data.csv");
//Point-type vector to store all data
vector<Point> database;
//bathDepth515 = depths between 500m and 1500m
vector<Point> bathyDepth515;
vector<Point> bathyDepth1020; //Create the rest too
Point p;
if (inputFile)
{
while(inputFile >> p)
{
database.push_back(p);
}
inputFile.close();
}
else
{
cout <<"Unable to open file";
}
for(int i = 0; i < database.size(); ++i)
{
//Group data in database in sets of bathyDepth
if(database[i].bathyDepth >= 500 && database[i].bathyDepth < 1500)
{
//Find and fill particular bathDepths
bathyDepth515.push_back(database[i]);
}
if(database[i].bathyDepth >= 1000 && database[i].bathyDepth < 2000)
{
bathyDepth1020.push_back(database[i]);
}
//...Further conditional statements based on Bathymetric depth, could easily include a spatial condition too...
}
//Calculate distance between point i and all other points within 500-1500m depth window. Do the same with all other windows.
for(int i = 0; i < bathyDepth515.size(); ++i)
{
for(int j = 0; j < bathyDepth515.size(); ++j)
{
double s12;
if(i != j)
{
geod.Inverse(bathyDepth515[i].latitude, bathyDepth515[i].longitude, bathyDepth515[j].latitude, bathyDepth515[j].longitude, s12);
}
}
}
return 0;
}
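Question 1:
I think it is clear that parts of my approach are not object-oriented. For example, there may be a better way to assign each data `Point` to a specific ocean basin than entering the basin boundaries by hand at the start of my program, as I also did when grouping the data by depth. I started writing a basin class with methods to retrieve the location and with the definition of each basin's lat/long, but I did not find an intuitive way to do it. I would like someone to give me an idea of how best to build this. My attempt at a (very fragile) class is as follows: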
class Basin
{
public:
Basin();
Basin(double latit1, double longit1, double latit2, double longit2);
double getLatitude();
...
private:
double NAt, SAt, NPac, SPac; //All basins
double latitude1, longitude1, latitude2, longitude2; // Boundaries defined by two latitude markers, and two longitude markers.
};
class NAt: public Basin{...}
//...Build class definitions...
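A minimal sketch of how the basin idea above might be organised, assuming each basin can be approximated by a single latitude/longitude box. Every basin becomes an instance of one `Basin` class rather than its own subclass, and a lookup function returns the first basin that contains a point. The reduced `Point`, the names `contains` and `basinOf`, and the "Unassigned" label are hypothetical, and any bounds passed in are placeholders, not real basin limits.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical reduced point type; the Point in the question has more fields.
struct Point {
    double latitude;
    double longitude;
    double bathyDepth;
};

// A basin described as a named latitude/longitude box.
// Assumption: a rectangle is a good enough outline for a first pass.
class Basin {
public:
    Basin(std::string name, double latMin, double latMax,
          double lonMin, double lonMax)
        : name_(std::move(name)), latMin_(latMin), latMax_(latMax),
          lonMin_(lonMin), lonMax_(lonMax)
    {
        assert(latMin <= latMax && lonMin <= lonMax);
    }

    const std::string& name() const { return name_; }

    // Half-open test, so two adjacent boxes never both claim a border point.
    bool contains(const Point& p) const
    {
        return p.latitude >= latMin_ && p.latitude < latMax_ &&
               p.longitude >= lonMin_ && p.longitude < lonMax_;
    }

private:
    std::string name_;
    double latMin_, latMax_, lonMin_, lonMax_;
};

// Returns the name of the first basin that contains p, or "Unassigned".
std::string basinOf(const Point& p, const std::vector<Basin>& basins)
{
    for (const Basin& b : basins) {
        if (b.contains(p)) {
            return b.name();
        }
    }
    return "Unassigned";
}
```

Adding a basin then means adding one element to a `std::vector<Basin>` instead of writing a new class or a new `if` branch.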
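My second concern is the way I create the vectors for the different depth windows. If I had to change how the depths are split, or add more depths, this could become very cumbersome. I do not want to change almost every part of my program just to accommodate a different choice of sliding depth window. I would appreciate any suggestions.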
//Vector of vectors to store each row's entries separately. Not sure this is needed, as my `Point` struct already allows each entry in a row to be accessed.
vector<vector<Point> > database;
//Create database vectors for the different bathyDepth windows. Even though they all share one name, that doesn't matter once they are pushed back.
for(int i = 1*500; i<=8*500; i+=500)
{
vector<Point> bathDepthi;
//Possibly push these back into database. These can then be accessed and populated using if statements.
}
//Creating the vectors individually works, but a vector of vectors would be more elegant: if I decide to change the range, I only have to change the for loop instead of creating more vectors.
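One possible way to keep the depth windows configurable, sketched under the assumption that the windows are defined by a start depth, an end depth, a width and a step (so 500-1500 m, 1000-2000 m, 1500-2500 m is width 1000 and step 500). The names `DepthWindow`, `makeWindows` and `groupByWindow`, and the reduced `Point`, are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reduced point type; the Point in the question has more fields.
struct Point {
    double latitude;
    double longitude;
    double bathyDepth;
};

// One depth window, covering the half-open interval [lower, upper).
struct DepthWindow {
    double lower;
    double upper;
};

// Builds sliding windows of the given width, advancing by step,
// keeping only windows that fit entirely inside [first, last].
std::vector<DepthWindow> makeWindows(double first, double last,
                                     double width, double step)
{
    assert(width > 0 && step > 0);
    std::vector<DepthWindow> windows;
    for (double lo = first; lo + width <= last; lo += step) {
        windows.push_back({lo, lo + width});
    }
    return windows;
}

// groups[w] receives every point whose bathyDepth falls in windows[w].
// Because windows may overlap, a point can appear in more than one group.
std::vector<std::vector<Point>> groupByWindow(
    const std::vector<Point>& points,
    const std::vector<DepthWindow>& windows)
{
    std::vector<std::vector<Point>> groups(windows.size());
    for (const Point& p : points) {
        for (std::size_t w = 0; w < windows.size(); ++w) {
            if (p.bathyDepth >= windows[w].lower &&
                p.bathyDepth < windows[w].upper) {
                groups[w].push_back(p);
            }
        }
    }
    return groups;
}
```

Changing the windowing is then a matter of changing the four arguments to `makeWindows`; the grouping loop stays the same.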
I am not sure how much sense my attempts make, but I hope they give a better idea of what I intend. Apologies for the long post.
Edit: design switch from `std::vector` to `std::map`
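Following the answer posted by heke, I tried the following, but I am not sure it is what that user meant. I chose to use conditional statements to determine whether a point lies within a particular basin. Even if my attempt is correct, I am still unsure how to proceed. More specifically, I do not know how to store the partitioned vectors so that I can access them later, say, to write them to separate text files (i.e. one .txt file per bathymetric depth range). Should I store the partition iterators in a vector? If so, of what type? The iterator is declared as `auto`, which leaves me confused about how to declare the type of a vector that would hold it.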
My attempt:
std::map<std::string, std::vector<Point> > seamap;
seamap.insert( std::pair<std::string, std::vector<Point> > ("Nat", vector<Point>{}) );
seamap.insert( std::pair<std::string, std::vector<Point> > ("Sat", vector<Point>{}) ); //Repeat for all ocean basins
Point p;
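while (inputFile >> p) //The stream test alone ends the loop; an extra eof() check can drop the last line when the file has no trailing newline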
{
//Check whether the point lies north of the Southern Ocean boundary (otherwise it goes to "South")
if (p.latitude > Slat2)
{
//Check if Atlantic, Pacific, Indian...
if (p.longitude >= NAtlong1 && p.longitude < SAtlong2 && p.latitude > SPLIT)
{
seamap["Nat"].push_back(p);
} // Repeat for different basins
}
else
{
seamap["South"].push_back(p);
}
}
//Partition basins by depth
for ( std::map<std::string, std::vector<Point> >::iterator it2 = seamap.begin(); it2 != seamap.end(); it2++ )
{
for ( int i = 500; i<=4500; i+=500 )
{
auto itp = std::partition( it2->second.begin(), it2->second.end(), [&i](const auto &a) {return a.bathyDepth < i;} );
}
}
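On the iterator question: `std::partition` applied to a `std::vector<Point>` returns a `std::vector<Point>::iterator`, so that is the type hidden behind `auto`, and the boundaries can be kept in a `std::vector` of that type. The sketch below is one way this could look, not necessarily what the answer intended. It assumes non-overlapping bins defined by sorted depth edges and partitions each time from the previous boundary rather than from `begin()`, so earlier bins are not disturbed. `PointIt`, `partitionByDepth`, `writeBins`, the reduced `Point` and the file-name pattern are all hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical reduced point type; the Point in the question has more fields.
struct Point {
    double latitude;
    double longitude;
    double bathyDepth;
};

using PointIt = std::vector<Point>::iterator;

// Reorders pts in place and returns one boundary per edge.
// Afterwards [bounds[k], bounds[k+1]) holds depths in [edges[k], edges[k+1]).
// The iterators stay valid only while pts is not resized or reordered.
std::vector<PointIt> partitionByDepth(std::vector<Point>& pts,
                                      const std::vector<double>& edges)
{
    assert(std::is_sorted(edges.begin(), edges.end()));
    std::vector<PointIt> bounds;
    PointIt start = pts.begin();
    for (double edge : edges) {
        // Only the not-yet-binned tail [start, end) is rearranged.
        start = std::partition(start, pts.end(),
            [edge](const Point& p) { return p.bathyDepth < edge; });
        bounds.push_back(start);
    }
    return bounds;
}

// Writes each bin to its own text file, e.g. "NAt_500-1000.txt".
void writeBins(const std::string& prefix,
               const std::vector<PointIt>& bounds,
               const std::vector<double>& edges)
{
    for (std::size_t k = 0; k + 1 < bounds.size(); ++k) {
        std::ofstream out(prefix + "_" +
            std::to_string(static_cast<int>(edges[k])) + "-" +
            std::to_string(static_cast<int>(edges[k + 1])) + ".txt");
        for (PointIt it = bounds[k]; it != bounds[k + 1]; ++it) {
            out << it->latitude << ' ' << it->longitude << ' '
                << it->bathyDepth << '\n';
        }
    }
}
```

With edges every 500 m, an overlapping 1000 m window such as 500-1500 m is simply the range `[bounds[k], bounds[k + 2])`, because adjacent bins are contiguous in the vector. Calling `partitionByDepth(it2->second, edges)` inside the loop over `seamap` would give one set of boundaries per basin.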